Problem description
I have an algorithm that takes a 2D array of size (in_x,in_y) and outputs another 2D array of size (out_x,out_y), where out_x <= in_x and out_y < in_y.
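This means that, at the macro level, some accumulation takes place. The algorithm itself is a simple shift-and-accumulate performed in stages according to some physical equations, with the intermediate results stored in 3D arrays.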
I want to determine which elements of in contribute to each element of out.
That is, an equation of the form
out[i,j] = sum_{m,n} in[m,n]
that tells me, for each i,j, which m,n enter the sum.
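A minimal sketch of one way to recover such an equation, assuming the algorithm is linear in its input (only shifts and additions, no data-dependent branches): feed it one-hot inputs and record which outputs become non-zero. The names toy_algorithm and contribution_map are hypothetical, and toy_algorithm is only a stand-in for the real staged computation.

```python
import numpy as np

def toy_algorithm(x):
    """Hypothetical stand-in: out[i, j] = x[i, j] + x[i, j + 1], keeping half the columns."""
    half = x.shape[1] // 2
    return x[:, :half] + x[:, 1:half + 1]

def contribution_map(algorithm, in_shape):
    """Return {(i, j): [(m, n), ...]} by pushing one-hot inputs through `algorithm`."""
    contributions = {}
    for m in range(in_shape[0]):
        for n in range(in_shape[1]):
            probe = np.zeros(in_shape)
            probe[m, n] = 1.0                      # only in[m, n] is "switched on"
            out = algorithm(probe)
            for i, j in zip(*np.nonzero(out)):     # outputs that saw in[m, n]
                contributions.setdefault((int(i), int(j)), []).append((m, n))
    return contributions

cmap = contribution_map(toy_algorithm, (2, 4))
print(cmap[(0, 1)])   # [(0, 1), (0, 2)]
```

The value of out at each non-zero position is the coefficient of in[m,n], so an element that is added twice shows up as 2.0 rather than 1.0. This costs one run of the algorithm per input element, which is fine for a one-off analysis of a small case such as (8,16).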
MRE
For an input 2D array Input=ones((8,16)), I logged all the computations that take place:
Iteration=0: State.shape=(8,4,16)
Iteration=0: State[:,:] = Input
Iteration=0: iter=1 State[:,1,1:] = State[:,1:] + Input[:,:-1]
Iteration=0: iter=2 State[:,2,2:] = State[:,2:] + Input[:,:-2]
Iteration=1: new State.shape=(4,16) old State.shape=(8,16)
Iteration=1: iter=0 new.State[0,16:16] = old.State[0,16:16]
Iteration=1: iter=0 new.State[0,0:16] = old.State[0,0:16] + old.State[1,0:16]
Iteration=1: iter=1 new.State[0,15:16] = old.State[0,15:16]
Iteration=1: iter=1 new.State[0,0:15] = old.State[0,0:15] + old.State[1,1:16]
Iteration=1: iter=2 new.State[0,15:16]
Iteration=1: iter=2 new.State[0,1:16]
Iteration=1: iter=0 new.State[1,16:16] = old.State[2,16:16]
Iteration=1: iter=0 new.State[1,0:16] = old.State[2,0:16] + old.State[3,0:16]
Iteration=1: iter=1 new.State[1,15:16] = old.State[2,15:16]
Iteration=1: iter=1 new.State[1,0:15] = old.State[2,0:15] + old.State[3,1:16]
Iteration=1: iter=2 new.State[1,15:16]
Iteration=1: iter=2 new.State[1,1:16]
Iteration=1: iter=0 new.State[2,16:16] = old.State[4,16:16]
Iteration=1: iter=0 new.State[2,0:16] = old.State[4,0:16] + old.State[5,0:16]
Iteration=1: iter=1 new.State[2,15:16] = old.State[4,15:16]
Iteration=1: iter=1 new.State[2,0:15] = old.State[4,0:15] + old.State[5,1:16]
Iteration=1: iter=2 new.State[2,15:16]
Iteration=1: iter=2 new.State[2,1:16]
Iteration=1: iter=0 new.State[3,16:16] = old.State[6,16:16]
Iteration=1: iter=0 new.State[3,0:16] = old.State[6,0:16] + old.State[7,0:16]
Iteration=1: iter=1 new.State[3,15:16] = old.State[6,15:16]
Iteration=1: iter=1 new.State[3,0:15] = old.State[6,0:15] + old.State[7,1:16]
Iteration=1: iter=2 new.State[3,15:16]
Iteration=1: iter=2 new.State[3,1:16]
Iteration=1: iter=3 new.State[3,3,14:16] = old.State[6,14:16]
Iteration=1: iter=3 new.State[3,0:14] = old.State[6,0:14] + old.State[7,2:16]
Iteration=2: new State.shape=(2,5,16) old State.shape=(4,16)
Iteration=2: iter=0 new.State[0,16:16]
Iteration=2: iter=0 new.State[0,0:16]
Iteration=2: iter=1 new.State[0,15:16]
Iteration=2: iter=1 new.State[0,1:16]
Iteration=2: iter=2 new.State[0,15:16]
Iteration=2: iter=2 new.State[0,1:16]
Iteration=2: iter=3 new.State[0,14:16] = old.State[0,14:16]
Iteration=2: iter=3 new.State[0,0:14] = old.State[0,0:14] + old.State[1,2:16]
Iteration=2: iter=4 new.State[0,14:16]
Iteration=2: iter=4 new.State[0,2:16]
Iteration=2: iter=0 new.State[1,16:16]
Iteration=2: iter=0 new.State[1,0:16]
Iteration=2: iter=1 new.State[1,15:16]
Iteration=2: iter=1 new.State[1,1:16]
Iteration=2: iter=2 new.State[1,15:16]
Iteration=2: iter=2 new.State[1,1:16]
Iteration=2: iter=3 new.State[1,14:16] = old.State[2,14:16]
Iteration=2: iter=3 new.State[1,0:14] = old.State[2,0:14] + old.State[3,2:16]
Iteration=2: iter=4 new.State[1,14:16]
Iteration=2: iter=4 new.State[1,2:16]
Iteration=3: new State.shape=(1,8,16) old State.shape=(2,16)
Iteration=3: iter=0 new.State[0,16:16]
Iteration=3: iter=0 new.State[0,0:16]
Iteration=3: iter=1 new.State[0,16:16]
Iteration=3: iter=1 new.State[0,0:16]
Iteration=3: iter=2 new.State[0,15:16]
Iteration=3: iter=2 new.State[0,1:16]
Iteration=3: iter=3 new.State[0,15:16]
Iteration=3: iter=3 new.State[0,1:16]
Iteration=3: iter=4 new.State[0,14:16]
Iteration=3: iter=4 new.State[0,2:16]
Iteration=3: iter=5 new.State[0,14:16]
Iteration=3: iter=5 new.State[0,2:16]
Iteration=3: iter=6 new.State[0,6,13:16] = old.State[0,13:16]
Iteration=3: iter=6 new.State[0,0:13] = old.State[0,0:13] + old.State[1,3:16]
Iteration=3: iter=7 new.State[0,7,13:16]
Iteration=3: iter=7 new.State[0,3:16]
Final output=
[[8. 8. 8. 8. 8. 8. 8. 8.]
[8. 8. 8. 8. 8. 8. 8. 8.]
[8. 8. 8. 8. 8. 8. 8. 8.]
[8. 8. 8. 8. 8. 8. 8. 8.]
[8. 8. 8. 8. 8. 8. 8. 8.]
[8. 8. 8. 8. 8. 8. 8. 8.]
[8. 8. 8. 8. 8. 8. 8. 8.]
[9. 9. 9. 9. 9. 9. 9. 9.]]
Note, however, that the last State, shape=(1,16), only needs its first 8 columns.
Attempt 1
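I tried searching for computational graphs, but all the resources I found are aimed at deep learning. I could use tensors with autograd, but I wanted to ask the community whether there is a dedicated tool for this.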
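For comparison, a dependency-free alternative to an autograd framework is to trace the computation with operator overloading. The sketch below is an illustration under assumptions, not a dedicated tool: Trace, traced_input and toy_algorithm are hypothetical names, and it relies on NumPy object arrays calling each element's own __add__ for elementwise addition.

```python
import numpy as np
from collections import Counter

class Trace:
    """Symbolic value: a multiset of the input indices that were summed into it."""
    def __init__(self, terms=None):
        self.terms = Counter(terms or {})

    def __add__(self, other):
        if isinstance(other, Trace):
            return Trace(self.terms + other.terms)   # merge the two multisets
        return NotImplemented

def traced_input(shape):
    """Object array whose element [m, n] carries the single term (m, n)."""
    arr = np.empty(shape, dtype=object)
    for idx in np.ndindex(*shape):
        arr[idx] = Trace({idx: 1})
    return arr

def toy_algorithm(x):
    """Hypothetical two-stage shift-and-accumulate, written with plain slicing."""
    stage = x[:, :-1] + x[:, 1:]         # add the right-hand neighbour
    return stage[0::2] + stage[1::2]     # fold pairs of rows together

out = toy_algorithm(traced_input((2, 3)))
print(sorted(out[0, 0].terms))   # [(0, 0), (0, 1), (1, 0), (1, 1)]
```

A single traced run yields every output's equation at once. The catch is that any intermediate array the real code preallocates (for example with zeros) would have to be created with dtype=object for the traced values to fit into it.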
End goal
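I want to speed up the operation. One way would be to JIT-compile the code with something like numba, but that would inevitably still have to deal with the branching and the intermediate 3D arrays.
Eventually I would use this set of equations to write intrinsics code (containing only loads/stores/accumulates).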
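As a rough illustration of that last step, once a contribution map is known it can be flattened into straight-line equations, or into index arrays that describe nothing but loads and accumulates. The map below is hand-written and hypothetical, as are the helper names emit_equations, compile_gather and apply_gather; the sketch assumes every coefficient is 1.

```python
import numpy as np

# Hypothetical contribution map: output index -> list of input indices.
contributions = {
    (0, 0): [(0, 0), (0, 1), (1, 0), (1, 1)],
    (0, 1): [(0, 1), (0, 2), (1, 1), (1, 2)],
}
in_shape, out_shape = (2, 3), (1, 2)

def emit_equations(contributions):
    """Render each output element as a flat sum of input elements."""
    lines = []
    for (i, j), sources in sorted(contributions.items()):
        rhs = " + ".join(f"in[{m},{n}]" for m, n in sources)
        lines.append(f"out[{i},{j}] = {rhs}")
    return lines

def compile_gather(contributions, in_shape, out_shape):
    """Flatten the map into parallel (source, destination) flat-index arrays."""
    src, dst = [], []
    for out_idx, sources in contributions.items():
        for in_idx in sources:
            src.append(np.ravel_multi_index(in_idx, in_shape))
            dst.append(np.ravel_multi_index(out_idx, out_shape))
    return np.array(src), np.array(dst)

def apply_gather(x, src, dst, out_shape):
    """Load x at src and accumulate into out at dst; no intermediate 3D state."""
    out = np.zeros(out_shape, dtype=x.dtype)
    flat = out.reshape(-1)                    # a view, since `out` is contiguous
    np.add.at(flat, dst, x.reshape(-1)[src])  # unbuffered add handles repeated dst
    return out

src, dst = compile_gather(contributions, in_shape, out_shape)
x = np.arange(6.0).reshape(in_shape)
print(emit_equations(contributions)[0])   # out[0,0] = in[0,0] + in[0,1] + in[1,0] + in[1,1]
print(apply_gather(x, src, dst, out_shape))
```

The emitted lines could serve as a template for hand-written load/accumulate/store code, while the index-array form is a quick way to check that the flattened equations reproduce the staged algorithm's output.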