Problem description
I am trying to share a large 3-dimensional numpy array with a process Pool, so that each worker can perform some operations on a slice of that large array. In my main:
import numpy as np
from multiprocessing import Pool, shared_memory

_dtype = np.dtype('float64')
n_rotations, n_coords, n_points = 7000, 3, 25600
shm = shared_memory.SharedMemory(
    create=True, size=n_rotations * n_coords * n_points * _dtype.itemsize)
rotations_name = shm.name
coordinates = np.ndarray(
    (n_rotations, n_points), dtype=_dtype, buffer=shm.buf)
coordinates = rotations @ ellipsoid  # rotations and ellipsoid are defined elsewhere
print(coordinates.shape)  # outputs (n_rotations, n_points)
chunks = [(rot_idx, rotations_name, args.output, (n_rotations, n_points), max_rad)
          for rot_idx in range(n_rotations)]
pool = Pool(args.processes)
_res = pool.starmap_async(gen_features, chunks).get()
gen_features is defined as follows:
def gen_features(idx: int, buf_name: str, _dir: str, rot_dims: tuple, max_rad: int):
    shm = shared_memory.SharedMemory(name=buf_name)
    rotations = np.ndarray(rot_dims, dtype=np.dtype('float64'), buffer=shm.buf)
    print(rotations)  # here the np array has become zero-filled for some reason
    del rotations  # release the view on shm.buf before closing
    shm.close()
    return idx
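The failure mode above can be reproduced without a Pool at all: rebinding the name `coordinates` to the result of an expression detaches it from the shared buffer, so a second handle on the same segment still sees zeros. A minimal single-process sketch (the tiny array, and the names `attached`, `view`, and `snapshot`, are mine, not from the original code):

```python
import numpy as np
from multiprocessing import shared_memory

# Create a small shared segment (freshly created segments are zero-filled).
shm = shared_memory.SharedMemory(create=True, size=4 * np.float64().itemsize)
coordinates = np.ndarray((4,), dtype=np.float64, buffer=shm.buf)

# This rebinds the *name* to a brand-new array; the shared buffer is untouched.
coordinates = np.arange(4.0)

# Attach to the same segment under a second handle, as a worker would.
attached = shared_memory.SharedMemory(name=shm.name)
view = np.ndarray((4,), dtype=np.float64, buffer=attached.buf)
snapshot = view.copy()  # still all zeros

del view  # release the buffer view before closing
attached.close()
shm.close()
shm.unlink()
```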
Solution
After nearly an hour of debugging, it turns out that you have to *copy* the data into shared memory, as described in this section of the docs:
b[:] = a[:] # Copy the original data into shared memory
Basically, this:
coordinates[:] = rotations @ ellipsoid
instead of:
coordinates = rotations @ ellipsoid
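The same single-process sketch with the fix applied shows the difference: slice assignment writes *through* the view into the shared buffer, so a second handle on the segment sees the data. (As before, the tiny array and the names `attached`, `view`, and `snapshot` are mine, for illustration only.)

```python
import numpy as np
from multiprocessing import shared_memory

shm = shared_memory.SharedMemory(create=True, size=4 * np.float64().itemsize)
coordinates = np.ndarray((4,), dtype=np.float64, buffer=shm.buf)

# Slice assignment copies the values into the shared buffer in place.
coordinates[:] = np.arange(4.0)

# A second handle on the same segment now sees the data.
attached = shared_memory.SharedMemory(name=shm.name)
view = np.ndarray((4,), dtype=np.float64, buffer=attached.buf)
snapshot = view.copy()

del view, coordinates  # release buffer views before closing
attached.close()
shm.close()
shm.unlink()
```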