How to optimize NumPy Python code with Dask and Numexpr

Problem description

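I have some code that I want to speed up without changing its structure. The code is not very clear, but all I need is for it to run faster. Sorry if I have done something wrong; I am a beginner.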

1. The original code I need to speed up

import numpy as np
import time
N = 3*10**3
data_1 = np.arange(N)
m = np.arange(N*N)
ind_1 = np.arange(N//2)

# Part of the code for optimizing
for i in range(10):
    t1 = time.time()
    data = np.tile(data_1,N)
    m_square = m.reshape(N,N)
    M_square = m_square * data_1
    d = np.sum(M_square,axis = 1)
    data_1 = data_1 + d * 2.0
    ind_2 = np.where(data != 0.0)[0]
    m_ind = m[ind_2]
    data_ind = data[ind_2]
    m[ind_2] = m_ind - data_ind * 2.0
    m[ind_1] = 0.0
    t1 = time.time() - t1
    print(t1)
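
If a small restructuring of the loop body is acceptable, most of the cost in the code above seems to come from the temporaries: np.tile builds an array of N*N elements, np.where builds an equally large index array, and m_square * data_1 builds a full N x N product before it is summed. The sketch below is one possible way to avoid them, not a definitive answer. It assumes m is a contiguous float array (so that m.reshape returns a view), and the function names step_reference and step_vectorized are made up for illustration. Subtracting 2 * data_1 from every row is equivalent to the np.where version because positions where data is zero subtract zero.

```python
import numpy as np

def step_reference(data_1, m, ind_1):
    """One loop iteration, written as in the question (modifies m in place)."""
    n = data_1.size
    data = np.tile(data_1, n)
    m_square = m.reshape(n, n)
    d = np.sum(m_square * data_1, axis=1)
    new_data_1 = data_1 + d * 2.0
    ind_2 = np.where(data != 0.0)[0]
    m[ind_2] = m[ind_2] - data[ind_2] * 2.0
    m[ind_1] = 0.0
    return new_data_1

def step_vectorized(data_1, m, ind_1):
    """Same iteration without np.tile, np.where, or the N x N product array."""
    n = data_1.size
    m_square = m.reshape(n, n)       # view on m (assumes m is contiguous)
    d = m_square @ data_1            # row-wise sum of m_square * data_1
    new_data_1 = data_1 + d * 2.0
    m_square -= 2.0 * data_1         # in place; uses the old data_1, as the original does
    m[ind_1] = 0.0
    return new_data_1
```

Both functions return the new data_1 and update m in place, so either one can be called inside the timing loop as data_1 = step_vectorized(data_1, m, ind_1).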

2. Optimization with Numexpr (a bit faster):

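import time
import numpy as np
import numexpr as ne

N = 10
data_1 = np.arange(N)
m = np.arange(N*N)
ind_1 = np.arange(N//2)

for i in range(10):
    t2 = time.time()
    data = np.tile(data_1,N)
    # m_square must be rebuilt from the current m; numexpr looks it up by name
    m_square = m.reshape(N,N)
    M_square = ne.evaluate('m_square * data_1')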
    d = np.sum(M_square,axis = 1)
    data_1 = ne.evaluate('data_1 + d * 2.0')
    ind_2 = np.where(data != 0.0)[0]
    m_ind = m[ind_2]
    data_ind = data[ind_2]
    m[ind_2] = ne.evaluate('m_ind - data_ind * 2.0')
    m[ind_1] = 0.0
    t2 = time.time() - t2
    print(t2)
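
In the Numexpr version the product m_square * data_1 is still materialized as a full N x N array and then reduced by np.sum. Numexpr expressions can also end in a sum reduction with an axis argument, so the two steps can be fused. The following is a minimal sketch under that assumption; the helper name row_weighted_sum is hypothetical, and whether it is actually faster depends on N and on the machine, so it should be timed rather than taken on faith.

```python
import numpy as np
import numexpr as ne

def row_weighted_sum(m_square, data_1):
    """Compute sum_j m_square[i, j] * data_1[j] without storing the product array."""
    # local_dict is passed explicitly so the expression does not rely on
    # numexpr inspecting the caller's frame for variable names.
    return ne.evaluate(
        "sum(m_square * data_1, axis=1)",
        local_dict={"m_square": m_square, "data_1": data_1},
    )
```

Inside the loop this would replace the two lines that compute M_square and d with d = row_weighted_sum(m_square, data_1).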

3. Optimization with dask delayed (slower):

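import time
import numpy as np
import dask
from dask.delayed import delayed

N = 10
data_1 = np.arange(N)
m = np.arange(N*N)
ind_1 = np.arange(N//2)

for i in range(10):
    t3 = time.time()
    m_square = m.reshape(N,N)
    # Build the lazy task graph; nothing runs until compute() is called
    data = delayed(np.tile)(data_1,N)
    M_square = delayed(np.multiply)(m_square,data_1)
    d = delayed(np.sum)(M_square,axis = 1)
    new_data_1 = delayed(np.add)(data_1,d * 2.0)
    # Evaluate both results in one pass, before m is modified in place
    data,data_1 = dask.compute(data,new_data_1)
    ind_2 = np.where(data != 0.0)[0]
    m_ind = m[ind_2]
    data_ind = data[ind_2]
    m[ind_2] = m_ind - data_ind * 2.0
    m[ind_1] = 0.0
    t3 = time.time() - t3
    print(t3)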
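
A likely reason the delayed version is slower: every delayed call adds a task to a graph that the scheduler has to walk, and with N = 10 that bookkeeping costs far more than the arithmetic itself. The steps also form a single sequential chain, so there is little for Dask to run in parallel. Dask tends to pay off when one large array is split into chunks that can be processed independently. The sketch below shows that idea for the row sum only, using dask.array; the helper name row_weighted_sum_dask and the default chunk_rows value are assumptions for illustration, and for arrays that fit comfortably in memory plain NumPy may well remain faster.

```python
import numpy as np
import dask.array as da

def row_weighted_sum_dask(m_square, data_1, chunk_rows=1000):
    """Row-wise sum of m_square * data_1, computed over blocks of rows."""
    # Split the matrix into blocks of chunk_rows full rows each.
    dm = da.from_array(m_square, chunks=(chunk_rows, m_square.shape[1]))
    # data_1 broadcasts across each block; compute() returns a NumPy array.
    return (dm * data_1).sum(axis=1).compute()
```

With N = 3*10**3 from the original code, d = row_weighted_sum_dask(m_square, data_1) would process the matrix in three blocks of 1000 rows.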

Solution

No working solution has been found for this problem yet.
