两个有条件的numpy数组的笛卡尔积 基准化

问题描述

给出下面带有w和x的 1d 数组,我可以使用以下代码形成笛卡尔积。

import numpy as np

w = np.array([1,2,3,4])
x = np.array([1,4])

V1 = np.transpose([np.repeat(w,len(x)),np.tile(x,len(w))])
print(V1)

[[1 1]
 [1 2]
 [1 3]
 [1 4]
 [2 1]
 [2 2]
 [2 3]
 [2 4]
 [3 1]
 [3 2]
 [3 3]
 [3 4]
 [4 1]
 [4 2]
 [4 3]
 [4 4]]

但是,我希望输出V1包括仅w 的数组行(如下所示)。我可以使用循环来做到这一点,但我希望能更快一些。

[[1 2]
 [1 3]
 [1 4]
 [2 3]
 [2 4]
 [3 4]]

解决方法

方法1

给出ONLY array rows where w < x(用于成对组合),这是实现相同的一种方法-

In [81]: r,c = np.nonzero(w[:,None]<x) # or np.less.outer(w,x)

In [82]: np.c_[w[r],x[c]]
Out[82]: 
array([[1,2],[1,3],4],[2,[3,4]])

方法2

使用纯基于遮罩的方法,它将是-

In [93]: mask = np.less.outer(w,x)

In [94]: s = (len(w),len(x))

In [95]: np.c_[np.broadcast_to(w[:,None],s)[mask],np.broadcast_to(x,s)[mask]]
Out[95]: 
array([[1,4]])

基准化

使用相对较大的数组:

In [8]: np.random.seed(0)
   ...: w = np.random.randint(0,1000,(1000))
   ...: x = np.random.randint(0,(1000))

In [9]: %%timeit
   ...: r,None]<x)
   ...: np.c_[w[r],x[c]]
11.3 ms ± 24.3 µs per loop (mean ± std. dev. of 7 runs,100 loops each)

In [10]: %%timeit
    ...: mask = np.less.outer(w,x)
    ...: s = (len(w),len(x))
    ...: np.c_[np.broadcast_to(w[:,s)[mask]]
10.5 ms ± 275 µs per loop (mean ± std. dev. of 7 runs,100 loops each)

In [11]: import itertools

# @Akshay Sehgal's soln
In [12]: %timeit [i for i in itertools.product(w,x) if i[0]<i[1]]
105 ms ± 1.38 ms per loop (mean ± std. dev. of 7 runs,10 loops each)
,

尝试使用itertools这一行方法-

import itertools

w = np.array([1,2,3,4])
x = np.array([1,4])

[i for i in itertools.product(w,x) if i[0]<i[1]]
[(1,2),(1,3),4),(2,(3,4)]

Itertools非常快速且内存高效。它应该很快。

,

您还可以通过以下方式过滤自己的解决方案:

V1 = np.transpose([np.repeat(w,len(x)),np.tile(x,len(w))])
V1[V1[:,0]<V1[:,1]]

@Divakar提出的解决方案当然是更快的,因为它不会进行多余的计算。


使用@Divakar的Benchit软件包进行基准测试:

#@Proposed solution here
def m1(x):
  V1 = np.transpose([np.repeat(w,len(w))])
  return V1[V1[:,1]]

#@Divakar's approach 1
def m2(x):
  r,x)
  return np.c_[w[r],x[c]]

#Divakar's approach 2
def m3(x):
  mask = np.less.outer(w,x)
  s = (len(w),len(x))
  return np.c_[np.broadcast_to(w[:,s)[mask]]

in_ = [np.arange(n) for n in [10,100,1000]]
w = x.copy()

enter image description here

相关问答

错误1:Request method ‘DELETE‘ not supported 错误还原:...
错误1:启动docker镜像时报错:Error response from daemon:...
错误1:private field ‘xxx‘ is never assigned 按Alt...
报错如下,通过源不能下载,最后警告pip需升级版本 Requirem...