Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Intermittent nullable test failure #17662

Closed
tkelman opened this issue Jul 28, 2016 · 7 comments
Closed

Intermittent nullable test failure #17662

tkelman opened this issue Jul 28, 2016 · 7 comments
Assignees
Labels
heisenbug This bug occurs unpredictably test This change adds or pertains to unit tests
Milestone

Comments

@tkelman
Copy link
Contributor

tkelman commented Jul 28, 2016

From worker 10:      * nullable             Test Failed
    From worker 10:   Expression: isa(convert(Nullable{Number},one(T)),Nullable{Number})
<or sometimes isa(convert(Nullable{Any},one(T)),Nullable{Any})>
<snip>
Exception running test nullable :
On worker 10:
LoadError: There was an error during testing
 in record at ./test.jl:397
 in do_test at ./test.jl:281
 in macro expansion; at /tmp/julia/share/julia/test/nullable.jl:285 [inlined]
 in anonymous at ./<missing>:?
 in include_string at ./loading.jl:380
 in include_from_node1 at ./loading.jl:429
 in macro expansion at ./util.jl:180 [inlined]
 in runtests at /tmp/julia/share/julia/test/testdefs.jl:7
 in #16 at /tmp/julia/share/julia/test/runtests.jl:44

e.g. https://travis-ci.org/JuliaLang/julia/jobs/147705996
https://travis-ci.org/JuliaLang/julia/jobs/147770860
https://travis-ci.org/JuliaLang/julia/jobs/147796256
https://travis-ci.org/JuliaLang/julia/jobs/147915601
https://ci.appveyor.com/project/JuliaLang/julia/build/1.0.4714/job/hd5ctt5loet610ai

Reproducible (eventually, takes maybe a dozen or so times to occur) via

for i in `seq 40`; do
  JULIA_CPU_CORES=1 ./julia test/runtests.jl read pollfd grisu nullable || break
done

cc @yuyichao who I think has caught this in rr

@tkelman tkelman added the heisenbug This bug occurs unpredictably label Jul 28, 2016
@vtjnash vtjnash added this to the 0.5.0 milestone Jul 28, 2016
@tkelman
Copy link
Contributor Author

tkelman commented Jul 28, 2016

Even running 40 times the bisect results are a bit tough to pin down, I'll go back to the same commit later and things will look fine where they previously failed after 10-15 repeats. This may depend on some non-reproducible object ID's or pointers that are getting saved in the system image and making the sys.so build non-deterministic, so maybe doing make clean && make -j4 && JULIA_CPU_CORES=1 ./julia test/runtests.jl read pollfd grisu nullable || break inside the loop would be better for really determining when this was introduced.

@yuyichao
Copy link
Contributor

who I think has caught this in rr

Right. Thanks @tkelman for providing the list of tests to reproduce this. It's reproducible with ~10% probability without rebuilding on current master.

The direct issue seems to be that Nullable{Any} (also Nullable{Union{}}) appears twice in the type cache causing the pointer comparison to fail.

Before the second one is inserted the type cache looks like.

(rr) p jl_(tn->cache)
svec(Base.Nullable{DataType}, Base.Nullable{Symbol}, Base.Nullable{SimpleVector}, Base.Nullable{Void}, Base.Nullable{Expr}, Base.Nullable{LambdaInfo}, Base.Nullable{Task}, Base.Nullable{Float16}, Base.Nullable{Float32}, Base.Nullable{Float64}, Base.Nullable{Bool}, Base.Nullable{Char}, Base.Nullable{Int8}, Base.Nullable{UInt8}, Base.Nullable{Int16}, Base.Nullable{UInt16}, Base.Nullable{Int32}, Base.Nullable{UInt32}, Base.Nullable{Int64}, Base.Nullable{UInt64}, Base.Nullable{Int128}, Base.Nullable{UInt128}, Base.Nullable{Base.##12#13}, Base.Nullable{Base.Cstring}, Base.Nullable{Base.##80#82}, Base.Nullable{Base.AbstractIOBuffer{Array{UInt8, 1}}}, Base.Nullable{Base.RefValue{Int32}}, Base.Nullable{Base.RefValue{UInt64}}, Base.Nullable{Base.Libc.TmStruct}, Base.Nullable{Base.##19#20{Tuple{Symbol, Array{Symbol, 1}, Symbol, Symbol, Symbol}}}, Base.Nullable{Base.##19#20{Tuple{Symbol, Array{Symbol, 1}, Symbol, Symbol}}}, Base.Nullable{Base.Condition}, Base.Nullable{Base.Cmd}, Base.Nullable{Base.Process}, Base.Nullable{Base.ProcessChain}, Base.Nullable{Base.##315#316}, Base.Nullable{Base.VersionNumber}, Base.Nullable{Base.RRID}, Base.Nullable{Base.RemoteValue}, Base.Nullable{Base.WorkerPool}, Base.Nullable{Base.RefValue{UInt16}}, Base.Nullable{Base.Test.Pass}, Base.Nullable{Base.Test.Fail}, Base.Nullable{Base.Test.Error}, Base.Nullable{Base.LineEdit.InputAreaState}, Base.Nullable{Base.FastMath.##1#2}, Base.Nullable{Base.LibGit2.GitIndex}, Base.Nullable{Base.LibGit2.GitAnnotated}, Base.Nullable{Base.Pkg.Types.VersionInterval}, Base.Nullable{Base.Dict{Base.VersionNumber, Int64}}, Base.Nullable{Base.Pkg.Resolve.PkgToMaxSumInterface.##2#5}, Base.Nullable{Base.Pkg.Resolve.PkgToMaxSumInterface.##3#6}, Base.Nullable{Base.Dates.##40#41{Base.Dates.SlotRule}}, Base.Nullable{Base.Markdown.MD}, Base.Nullable{Base.KeyIterator{Base.Dict{String, String}}}, Base.Nullable{Base.Math.##31#34{Int64}}, Base.Nullable{Base.##19#20{Tuple{String, Array{String, 1}, String}}}, Base.Nullable{Base.Pkg.Resolve.MaxSum.##12#17{Int64, Array{Int64, 1}}}, Base.Nullable{Base.Math.##1#2{Int64, Int64}}, Base.Nullable{Base.RefValue{Base.LibGit2.CloneOptions}}, Base.Nullable{Base.RefValue{UInt32}}, Base.Nullable{Base.##19#20{Tuple{Array{String, 1}, String, String}}}, Base.Nullable{Base.Math.##25#28{Int64}}, Base.Nullable{NullableTestEnum.TestEnum}, Base.Nullable{Base.Dict}, Base.Nullable{Base.LibGit2.AbstractPayload}, Base.Nullable{Exception}, Base.Nullable{Union{}}, Base.Nullable{AbstractString}, Base.Nullable{IO}, Base.Nullable{Array}, Base.Nullable{Any}, Base.Nullable{Integer}, #<null>, #<null>, #<null>, #<null>, #<null>, #<null>, #<null>, #<null>, #<null>, #<null>, #<null>, #<null>, #<null>, #<null>, #<null>, #<null>, #<null>)

In particular, some of the order seems broken.

(rr) p dt_compare(&data[69], &data[70])
$21 = -1
(rr) p dt_compare(&data[70], &data[71])
$22 = 1
(rr) p jl_(data[69])
Base.Nullable{IO}
$23 = void
(rr) p jl_(data[70])
Base.Nullable{Array}
$24 = void
(rr) p jl_(data[71])
Base.Nullable{Any}
$25 = void

Haven't figured out how that happened yet.

@yuyichao
Copy link
Contributor

And the wrong order is created after deserialization so it might be related to 735316e ? @JeffBezanson

@yuyichao
Copy link
Contributor

Or 9922032

@vtjnash
Copy link
Member

vtjnash commented Jul 28, 2016

#17405 was merged 2 days, ago, so not 9922032?

@vtjnash
Copy link
Member

vtjnash commented Jul 28, 2016

min repro?

julia> object_id(Nullable{Base.Dict})
0x53d5e3720e642a71
-------------- restart julia --------------
julia> object_id(Nullable{Base.Dict})
0xa830a0533ba5fe4b
### !!!CORRUPTION!!! ###

(which does sound an awful lot like 9922032#diff-757551c8929efe195c9bab04f9925d05L1065, since the TypeVar's after restarting aren't egal to the same TypeVar's before restarting, that'll corrupt the cache)

@JeffBezanson
Copy link
Member

Yeah, we can back out that part of the change.

@JeffBezanson JeffBezanson self-assigned this Jul 28, 2016
@kshyatt kshyatt added the test This change adds or pertains to unit tests label Jul 28, 2016
@KristofferC KristofferC reopened this Jul 28, 2016
JeffBezanson added a commit that referenced this issue Jul 28, 2016
fix #17662, failure in nullable test due to type cache corruption
mfasi pushed a commit to mfasi/julia that referenced this issue Sep 5, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
heisenbug This bug occurs unpredictably test This change adds or pertains to unit tests
Projects
None yet
Development

No branches or pull requests

6 participants