Julia: SharedArray with remote workers becomes a 0-element array

Problem

I am trying to run some code on Julia 1.5.3 using remote workers on a server, combined with local workers. When run locally with 24 workers, the following code works fine:

using Distributed
using SharedArrays
a = SharedArray{Float64}(100)
@sync @distributed for i = 1:100
    a[i] = i+1
end
sum(a)

If I add remote workers:

N_remote = 24
for i=1:N_remote
    addprocs(["user@192.168.0.129"],tunnel=true,dir="/home/user/scripts/",exename="/home/user/julia-1.5.3/bin/julia")
end

then I get the following error when running the first snippet:

 julia> include("test_sharedarray.jl")
ERROR: LoadError: TaskFailedException:
On worker 4:
BoundsError: attempt to access 0-element Array{Float64,1} at index [1]
setindex! at ./array.jl:847 [inlined]
setindex! at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/SharedArrays/src/SharedArrays.jl:510
macro expansion at /home/usuaris/spcom/gfebrer/bayesian_mc_watson/scripts/test_sharedarray.jl:5 [inlined]
#13 at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Distributed/src/macros.jl:301
#160 at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Distributed/src/macros.jl:87
#103 at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Distributed/src/process_messages.jl:290
run_work_thunk at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Distributed/src/process_messages.jl:79
run_work_thunk at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Distributed/src/process_messages.jl:88
#96 at ./task.jl:356

...and 23 more exception(s).

Stacktrace:
 [1] sync_end(::Channel{Any}) at ./task.jl:314
 [2] (::Distributed.var"#159#161"{var"#13#14",UnitRange{Int64}})() at ./task.jl:333
Stacktrace:
 [1] sync_end(::Channel{Any}) at ./task.jl:314
 [2] top-level scope at task.jl:333
 [3] include(::String) at ./client.jl:457
 [4] top-level scope at REPL[5]:1
in expression starting at /home/user/scripts/test_sharedarray.jl:4

Solution

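SharedArrays only work within a single cluster node. In other words, they share RAM between processes running on the same server. Workers added on another server cannot see that memory, so on those workers the array shows up as a 0-element array, which is what triggers the BoundsError above.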
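As an illustration of that restriction, here is a minimal sketch that runs the loop only on the processes that actually map the array, as reported by `procs(a)`, instead of on every worker as `@distributed` does. The two local workers added with `addprocs(2)` are an assumption for the example. Note that this only avoids the error; it does not put the remote workers to use.

```julia
using Distributed
addprocs(2)                       # assumption: two local workers, for illustration only
@everywhere using SharedArrays

a = SharedArray{Float64}(100)     # by default, mapped only by workers on this host

# Run the loop only on the processes that map `a`, not on every worker.
@sync for p in procs(a)
    @spawnat p begin
        for i in localindices(a)  # default index range assigned to this process
            a[i] = i + 1
        end
    end
end

sum(a)
```

Workers on other machines are never asked to touch `a` here, so they never see the empty array.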

You should use DistributedArrays.jl instead:

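using Distributed, DistributedArrays
addprocs(2)
@everywhere using DistributedArrays
# 3×4 array of zeros, split into chunks owned by the workers
a = dzeros((3,4), workers())
# One iteration per worker, so each worker fills only the chunk it owns
@sync @distributed for i = 1:nworkers()
    a_part = localpart(a)
    vec(a_part) .= (1:length(a_part)) .+ 1000*myid()
end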

Now let's look at a:

julia> a
3×4 DArray{Float64,2,Array{Float64,2}}:
 2001.0  2004.0  3001.0  3004.0
 2002.0  2005.0  3002.0  3005.0
 2003.0  2006.0  3003.0  3006.0
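Applied to the loop from the question, the same idea might look like the sketch below. It assumes a one-dimensional `DArray` of length 100 and, for illustration, two local workers. In the real setup the workers would come from the `addprocs` call with the remote host. Each worker uses `localindices` to find which global indices its chunk covers.

```julia
using Distributed
addprocs(2)                          # assumption: stand-in for the real local + remote workers
@everywhere using DistributedArrays

a = dzeros((100,), workers())        # 100 zeros, split across all workers

@sync for p in procs(a)
    @spawnat p begin
        part = localpart(a)          # the chunk stored on this worker
        idx  = localindices(a)[1]    # global indices covered by that chunk
        part .= idx .+ 1             # same as a[i] = i + 1 for i in idx
    end
end

sum(a)
```

Unlike a SharedArray, each chunk lives in the memory of the worker that owns it, so this layout does not depend on the workers being on the same machine.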