How to parallelize a function in a list comprehension while preserving order

Problem description

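I have a list of 2D arrays (of different lengths), and I need to apply certain functions to these arrays efficiently via a list comprehension.

The list comprehension is not fast enough as it stands, so it needs to be parallelized.

What is the right way to do that while keeping the order of the slices (or "subarrays") intact?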

def get_slice_max(arr):
    '''
    Return the slice with every element replaced by the maximum value
    that has occurred so far (up to and including the current position).
    '''
    result = [arr[0]]
    for i in range(1, len(arr)):
        result.append(max(result[-1], arr[i]))
    return result

result = [get_slice_max(slice_) for slice_ in a]

Reproducible sample:

import random
import numpy as np

a = [np.array(range(1, random.randint(3, 8))) for x in range(10000)]

Edit: I need to parallelize list comprehensions like these:

temp = np.random.randint(1, high=100, size=10)  # determines the sizes of the subarrays
A = [np.random.randint(0, high=2, size=x) for x in temp]  # high is exclusive, so values are 0 or 1
B = [np.random.uniform(size=x) for x in temp]
C = [np.random.uniform(size=x) for x in temp]
result = [[y if x == 1 else z for x, y, z in zip(a, b, c)]
          for a, b, c in zip(A, B, C)]

temp = np.random.randint(1, high=100, size=10)  # determines the sizes of the subarrays (upper bound assumed)
D = [np.random.uniform(size=x) for x in temp]
E = [np.random.randint(1, high=10, size=x) for x in temp]  # assumed range; starting at 1 avoids division by zero
[[x / y for x, y in zip(d, np.maximum.accumulate(get_slice_max(e)))] for d, e in zip(D, E)]

Solution

Use numpy.maximum.accumulate

# Sample
a = [np.random.randint(1,10,np.random.randint(3,8)) for _ in range(10000)]
a[:3]
# [array([4,5,6]),array([7,2,8,9,5]),array([5,1,7,5])]

[np.maximum.accumulate(arr) for arr in a]

Output (first three arrays):

[array([4,5,6]),array([7,7,8,9,9]),array([5,5,7,7])]

Verification:

all(np.array_equal(get_slice_max(arr),np.maximum.accumulate(arr)) for arr in a)
# True
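
The comprehensions from the question's edit can probably be handled the same way, by replacing the inner Python loop with a NumPy call while the outer comprehension keeps the subarrays in order. The following is only a sketch under assumptions: the names `sizes`, `selected` and `ratios` are hypothetical, and the random ranges (0/1 flags for `A`, integers from 1 to 9 for `E`) are chosen for illustration, not taken from the question.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sample data; sizes and value ranges are assumptions.
sizes = rng.integers(1, 100, size=10)
A = [rng.integers(0, 2, size=n) for n in sizes]   # flags: 0 or 1
B = [rng.uniform(size=n) for n in sizes]
C = [rng.uniform(size=n) for n in sizes]

# Element-wise "y if x == 1 else z" per subarray, via np.where.
selected = [np.where(a == 1, b, c) for a, b, c in zip(A, B, C)]

D = [rng.uniform(size=n) for n in sizes]
E = [rng.integers(1, 10, size=n) for n in sizes]  # starts at 1 to avoid dividing by zero

# Divide each subarray by the running maximum of its partner subarray.
ratios = [d / np.maximum.accumulate(e) for d, e in zip(D, E)]
```

Because the outer loop is still an ordinary list comprehension, `selected[i]` and `ratios[i]` correspond to the i-th input subarrays.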

Benchmark (about 5x faster):

%timeit [np.maximum.accumulate(arr) for arr in a]
# 6.07 ms ± 498 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit [get_slice_max(arr) for arr in a]
# 32.4 ms ± 11 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
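
If a per-array function cannot be vectorized like this, the literal question (parallelize and keep order) can be approached with `concurrent.futures`: `Executor.map` returns results in the order of its inputs, whichever task finishes first. The sketch below is an illustration rather than a tuned implementation. The helper `ordered_parallel_map` and the sample `data` are hypothetical, and a thread pool is used only to keep the example self-contained.

```python
from concurrent.futures import ThreadPoolExecutor


def get_slice_max(arr):
    """Running maximum of a sequence (same logic as in the question)."""
    result = [arr[0]]
    for value in arr[1:]:
        result.append(max(result[-1], value))
    return result


def ordered_parallel_map(func, items, executor, chunksize=1):
    """Apply func to every item; results come back in input order.

    Hypothetical helper: it relies on Executor.map, which yields
    results in the order of the inputs, not in completion order.
    """
    return list(executor.map(func, items, chunksize=chunksize))


# Hypothetical sample: lists of different lengths.
data = [list(range(n, 0, -1)) for n in range(3, 8)]

with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = ordered_parallel_map(get_slice_max, data, pool)

serial = [get_slice_max(x) for x in data]
print(parallel == serial)
```

For CPU-bound pure-Python work, `ProcessPoolExecutor` offers the same `map` interface and can be passed in instead, typically under an `if __name__ == "__main__":` guard and with a larger `chunksize` so that many small arrays are sent per task. With arrays as short as the ones in the sample, the inter-process overhead may well outweigh any gain, so it is worth timing against the NumPy version first.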
