
KeyError: key Dagger not found #509


Closed
droodman opened this issue May 6, 2024 · 3 comments · Fixed by #510


droodman commented May 6, 2024

Just copied an example from the documentation in a new Julia session...

               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.10.3 (2024-04-30)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> using Distributed, Dagger

julia> addprocs(4);

julia> X = Dagger.@shard myid()
ERROR: On worker 2:
KeyError: key Dagger [d58978e5-989f-55fb-8d15-ea34adc7bf54] not found
Stacktrace:
  [1] getindex
    @ .\dict.jl:498 [inlined]
  [2] macro expansion
    @ .\lock.jl:267 [inlined]
  [3] root_module
    @ .\loading.jl:1878
  [4] deserialize_module
    @ C:\Users\drood\.julia\juliaup\julia-1.10.3+0.x64.w64.mingw32\share\julia\stdlib\v1.10\Serialization\src\Serialization.jl:994
  [5] handle_deserialize
    @ C:\Users\drood\.julia\juliaup\julia-1.10.3+0.x64.w64.mingw32\share\julia\stdlib\v1.10\Serialization\src\Serialization.jl:896
  [6] deserialize
    @ C:\Users\drood\.julia\juliaup\julia-1.10.3+0.x64.w64.mingw32\share\julia\stdlib\v1.10\Serialization\src\Serialization.jl:814
  [7] deserialize_datatype
    @ C:\Users\drood\.julia\juliaup\julia-1.10.3+0.x64.w64.mingw32\share\julia\stdlib\v1.10\Serialization\src\Serialization.jl:1398
  [8] handle_deserialize
    @ C:\Users\drood\.julia\juliaup\julia-1.10.3+0.x64.w64.mingw32\share\julia\stdlib\v1.10\Serialization\src\Serialization.jl:867
  [9] deserialize
    @ C:\Users\drood\.julia\juliaup\julia-1.10.3+0.x64.w64.mingw32\share\julia\stdlib\v1.10\Serialization\src\Serialization.jl:814
 [10] handle_deserialize
    @ C:\Users\drood\.julia\juliaup\julia-1.10.3+0.x64.w64.mingw32\share\julia\stdlib\v1.10\Serialization\src\Serialization.jl:874
 [11] deserialize
    @ C:\Users\drood\.julia\juliaup\julia-1.10.3+0.x64.w64.mingw32\share\julia\stdlib\v1.10\Serialization\src\Serialization.jl:814 [inlined]
 [12] deserialize_msg
    @ C:\Users\drood\.julia\juliaup\julia-1.10.3+0.x64.w64.mingw32\share\julia\stdlib\v1.10\Distributed\src\messages.jl:87
 [13] #invokelatest#2
    @ .\essentials.jl:892 [inlined]
 [14] invokelatest
    @ .\essentials.jl:889 [inlined]
 [15] message_handler_loop
    @ C:\Users\drood\.julia\juliaup\julia-1.10.3+0.x64.w64.mingw32\share\julia\stdlib\v1.10\Distributed\src\process_messages.jl:176
 [16] process_tcp_streams
    @ C:\Users\drood\.julia\juliaup\julia-1.10.3+0.x64.w64.mingw32\share\julia\stdlib\v1.10\Distributed\src\process_messages.jl:133
 [17] #103
    @ C:\Users\drood\.julia\juliaup\julia-1.10.3+0.x64.w64.mingw32\share\julia\stdlib\v1.10\Distributed\src\process_messages.jl:121
Stacktrace:
  [1] remotecall_fetch(::Function, ::Distributed.Worker; kwargs::@Kwargs{})
    @ Distributed C:\Users\drood\.julia\juliaup\julia-1.10.3+0.x64.w64.mingw32\share\julia\stdlib\v1.10\Distributed\src\remotecall.jl:465
  [2] remotecall_fetch(::Function, ::Distributed.Worker)
    @ Distributed C:\Users\drood\.julia\juliaup\julia-1.10.3+0.x64.w64.mingw32\share\julia\stdlib\v1.10\Distributed\src\remotecall.jl:454
  [3] remotecall_fetch
    @ C:\Users\drood\.julia\juliaup\julia-1.10.3+0.x64.w64.mingw32\share\julia\stdlib\v1.10\Distributed\src\remotecall.jl:492 [inlined]
  [4] OSProc(pid::Int64)
    @ Dagger C:\Users\drood\.julia\packages\Dagger\5F8wE\src\processor.jl:110
  [5] iterate
    @ .\generator.jl:47 [inlined]
  [6] collect_to!
    @ .\array.jl:892 [inlined]
  [7] collect_to_with_first!
    @ .\array.jl:870 [inlined]
  [8] _collect(c::Vector{Int64}, itr::Base.Generator{Vector{…}, Type{…}}, ::Base.EltypeUnknown, isz::Base.HasShape{1})
    @ Base .\array.jl:864
  [9] collect_similar
    @ .\array.jl:763 [inlined]
 [10] map
    @ .\abstractarray.jl:3285 [inlined]
 [11] Context
    @ C:\Users\drood\.julia\packages\Dagger\5F8wE\src\context.jl:34 [inlined]
 [12] eager_context()
    @ Dagger.Sch C:\Users\drood\.julia\packages\Dagger\5F8wE\src\sch\eager.jl:9
 [13] shard(f::Any; procs::Nothing, workers::Nothing, per_thread::Bool)
    @ Dagger C:\Users\drood\.julia\packages\Dagger\5F8wE\src\chunks.jl:185
 [14] shard(f::Any)
    @ Dagger C:\Users\drood\.julia\packages\Dagger\5F8wE\src\chunks.jl:180
 [15] top-level scope
    @ C:\Users\drood\.julia\packages\Dagger\5F8wE\src\chunks.jl:223
Some type information was truncated. Use `show(err)` to see complete types.

JamesWrigley (Collaborator) commented

I think the issue is that Dagger is loaded before the workers are added. If you load it afterwards, it works:

               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.11.0-beta1 (2024-04-10)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> using Distributed

julia> addprocs(4)
4-element Vector{Int64}:
 2
 3
 4
 5

julia> using Dagger

julia> X = Dagger.@shard myid()
Dagger.Shard(Dict{Dagger.Processor, Dagger.Chunk}(OSProc(1) => Dagger.Chunk{Int64, MemPool.DRef, OSProc, ProcessScope}(Int64, UnitDomain(), MemPool.DRef(1, 16, 0x0000000000000008), OSProc(1), ProcessScope: worker == 1, false), OSProc(2) => Dagger.Chunk{Int64, MemPool.DRef, OSProc, ProcessScope}(Int64, UnitDomain(), MemPool.DRef(2, 0, 0x0000000000000008), OSProc(2), ProcessScope: worker == 2, false), OSProc(3) => Dagger.Chunk{Int64, MemPool.DRef, OSProc, ProcessScope}(Int64, UnitDomain(), MemPool.DRef(3, 0, 0x0000000000000008), OSProc(3), ProcessScope: worker == 3, false), OSProc(4) => Dagger.Chunk{Int64, MemPool.DRef, OSProc, ProcessScope}(Int64, UnitDomain(), MemPool.DRef(4, 0, 0x0000000000000008), OSProc(4), ProcessScope: worker == 4, false), OSProc(5) => Dagger.Chunk{Int64, MemPool.DRef, OSProc, ProcessScope}(Int64, UnitDomain(), MemPool.DRef(5, 0, 0x0000000000000008), OSProc(5), ProcessScope: worker == 5, false)))

AFAIK this is a limitation of Distributed.jl rather than Dagger itself. Where did you see that example in the docs?
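
For reference, the load order described above can be written out as a complete script. This is a minimal sketch, not taken from the Dagger documentation: the explicit `@everywhere using Dagger` line is an addition that does not appear in this thread, included on the assumption that stating the load on every process makes the requirement visible. The worker count of 4 simply mirrors the session above.

```julia
using Distributed

# Start the workers first, so that they exist before Dagger is loaded.
addprocs(4)

# Assumption: loading Dagger explicitly on every process (master and workers)
# rather than relying on the order of `using` statements alone.
@everywhere using Dagger

# The line of interest from the quick start page: one value per process.
X = Dagger.@shard myid()
```

The design choice here is only about ordering: every process that will receive Dagger objects has Dagger loaded before `Dagger.@shard` runs.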

droodman commented May 6, 2024

Ah, yes, that does fix it.

But I think it points up a gap in the documentation. The example is from the documentation in the sense that the line of interest, X = Dagger.@shard myid(), is on the quick start page. I wanted to try it in a Julia session, so I did what seemed the obvious thing to me. myid() is in Distributed, so I loaded that with using. While I was at it, I loaded Dagger. Then I ran addprocs(). Then I ran the command of interest. It crashed. I thought, oh, I guess Dagger is not worth the trouble. More of a quick end than a quick start!

If it is easy to get a crash when using Dagger, then I think how to avoid that should be prominent on the quick start page. Put another way, there isn't a complete example on the quick start page that includes the using commands and whatever other setup is needed. Or is it possible for Dagger to detect the condition that causes the crash and provide a helpful message?

JamesWrigley (Collaborator) commented

Yeah that's fair, I added some docs about it in #510.

> Or is it possible for Dagger to detect the condition that causes the crash and provide a helpful message?

I don't think this is possible in Dagger itself; it would need to be added in Distributed. What's happening is that the master process executes some code (like Dagger.@shard) that serializes Dagger objects and sends them to the workers. If the workers don't have Dagger loaded, they see a name like Dagger in the object type and cannot deserialize the object, because they don't know anything about the Dagger module.
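
A user-side check is still possible before any Dagger call is made. The following is a rough sketch under stated assumptions, not a feature of Dagger or Distributed: `workers_missing` and `check_loaded_everywhere` are hypothetical helper names, and `Base.root_module_exists` is an internal, non-public Base function that may change between Julia versions. The UUID is the one printed in the KeyError message of the original report.

```julia
using Distributed

# Package identity for Dagger; the UUID is copied from the KeyError above.
const DAGGER_ID = Base.PkgId(Base.UUID("d58978e5-989f-55fb-8d15-ea34adc7bf54"), "Dagger")

# Hypothetical helper: ids of the workers on which `pkg` has not been loaded.
# Each worker is asked whether the package is among its loaded root modules.
# Base.root_module_exists is internal (assumption: it remains available).
function workers_missing(pkg::Base.PkgId)
    return [w for w in workers() if !remotecall_fetch(Base.root_module_exists, w, pkg)]
end

# Hypothetical helper: fail early with a readable message instead of letting
# deserialization fail on a worker with a KeyError.
function check_loaded_everywhere(pkg::Base.PkgId)
    missing_on = workers_missing(pkg)
    isempty(missing_on) && return nothing
    error("$(pkg.name) is not loaded on workers $(missing_on); " *
          "run `@everywhere using $(pkg.name)` after addprocs")
end
```

Calling `check_loaded_everywhere(DAGGER_ID)` right before the first `Dagger.@shard` would raise the readable error in the situation from the original report. Only a `PkgId` and a Base function are sent to the workers, so the check itself does not depend on Dagger being deserializable there.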
