Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fancier GC preserve intrinsics #23610

Merged
merged 1 commit into from
Sep 15, 2017
Merged

Fancier GC preserve intrinsics #23610

merged 1 commit into from
Sep 15, 2017

Conversation

Keno
Copy link
Member

@Keno Keno commented Sep 6, 2017

@vtjnash and @yuyichao complained that the gcuse intrinsic I introduced wasn't dce safe. This version should be.

@yuyichao
Copy link
Contributor

yuyichao commented Sep 6, 2017

👍 I think you also need to make sure the DCE in inference does not delete it.

e.g.

julia/base/inference.jl

Lines 3664 to 3669 in 508c9f5

if ((isa(expr, Expr) && is_meta_expr(expr)) ||
isa(expr, LineNumberNode))
# keep any lexically scoped expressions
i += 1
continue
end

src/codegen.cpp Outdated
Value **vals = (Value**)alloca(sizeof(Value *) * nargs);
for (size_t i = 0; i < nargs; ++i) {
if (!argv[i].isboxed) {
jl_error("Expr(:gc_preserve) only works on boxed values");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We allow this for ccall so we should also just ignore it here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I don't think so. This doesn't make sense on non-boxed values and people may be surprised, by e.g.:

n = 100000
@gc_preserve n do_something(pointer_from_objref(n))

otherwise. With this framework, we can also add a version of pointer_form_objref that gives you the pointer as well as a lifetime token, but @vtjnash seemed to think we should get rid of pointer_form_objref for non-obviously-boxed values anyway.

Copy link
Contributor

@yuyichao yuyichao Sep 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ccall version also doesn't guarantee boxing of isbits objects if it is passed in as a gc root, i.e. the following def is invalid.

cconvert(::Type{Ref{Int}}, v::Int) = v
unsafe_convert(::Type{Ref{Int}}, v::Int) = Ptr{Int}(pointer_from_objref(v))

I think this is not so different from your example.

This mainly makes it very hard to write generic unsafe code in a safe way. The most obvious usecase that I'd like to support is to implement an unsafe C function in julia and being able to replace

ccall(:c_vertion, Void, (Ptr{Void},), obj)

with

root = cconvert(Ptr{Void}, obj)
@gc_preserve root begin
    ptr = unsafe_convert(Ptr{Void}, root)
    julia_vertion(ptr)
end

And not having to worry about the type of root.

This also makes optimization in inference harder since we might want to split the root just like what we do for ccall.

Also, in any case, this should emit an error instead of throwing one from codegen.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this framework, we can also add a version of pointer_form_objref that gives you the pointer as well as a lifetime token

You mean ((a = cconvert(Ref{Any}, v)), unsafe_convert(Ref{Any}, a))?

get rid of pointer_form_objref for non-obviously-boxed values anyway.

Having it raise an error is certainly doable. As lone as we still have a more hidden version for debugging like ccall(:jl_value_ptr, Ptr{Void}, (Any,), v)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This mainly makes it very hard to write generic unsafe code in a safe way

I think Ref already should already provide a way to handle this case:

julia> Base.unsafe_convert(Ref{Any}, Base.cconvert(Ref{Any}, 1.0))
Ptr{Any} @0x0000000117b5c1a0

julia> unsafe_load(ans)
1.0

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Ref already should already provide a way to handle this case:

Not when you don't know the type of the input and not when pointer_to_objref isn't what you want to get (and it's not what my code above is doing anyway).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we disallow pointer_from_objref in the problematic case, I'm fine with ignoring this error here. If not, I think I'd be worried that people expect that
@gc_preserve x use(pointer_from_objref(x)) will keep the pointer returned from pointer_from_objref valid.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the difference between this and the unsafe_convert code I give? Both of them are writing code for unsafe code so the user should be already warned of possible issue and should read the doc carefully about the semantics.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I'll remove the error.

base/pointer.jl Outdated
s, r = gensym(), gensym()
esc(quote
$s = $(Expr(:gc_preserve, syms...))
$r = $(args[end])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to either catch (and raise error) or support control flow out of this block?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depends on what you mean. Control flow out of the block should work fine. Whether the thing will still be rooted then will depend on dominance in general.

@ararslan ararslan added the GC Garbage collector label Sep 6, 2017
@yuyichao yuyichao added the compiler:codegen Generation of LLVM IR and native code label Sep 6, 2017
@yuyichao
Copy link
Contributor

yuyichao commented Sep 6, 2017

Actually no inference.jl DCE should be fine. I just wasn't sure if this is following control flow or lexical order (@inbounds is the latter) I guess we can just document that it is not allowed to jump in or out of the @gc_preserve block but it'll still be nice if we can enforce that...

base/pointer.jl Outdated
end
s, r = gensym(), gensym()
esc(quote
$s = $(Expr(:gc_preserve, syms...))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

local s, r

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this maybe be called :gc_preserve_begin to pair with :gc_preserve_end?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't care too much. It is called begin/end at the llvm level. I had a vague notion that there was a meaningful difference, but I'm not sure I want to spend too much time thinking about it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems sensible to match the LLVM names. I would guess that the difference is that the LLVM ones have to be paired while you can maybe have the Julia-level :gc_preserve expression without a paired :gc_preserve_end and have things still work (I'm guessing at that). Matching LLVM still seems sensible to me even with that difference.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'll just rename it.

(The `llvm.` in the name is required in order to be able to use the `token`
type). The semantics of these intrinsics are as follows:
At any safepoint that is dominated by a `gc_preserve_begin` call, but that does
not dominate a corresponding `gc_preserve_end` call (i.e. a call whose argument
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should "but that does not dominate a corresponding gc_preserve_end call" be "and is not dominated by a corresponding gc_preserve_end call"? I would think the preserve region is between them, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're right.

@Keno Keno force-pushed the kf/fancygcpreserve branch from 69d2b61 to 7db327e Compare September 9, 2017 18:05
@Keno Keno force-pushed the kf/fancygcpreserve branch from 7db327e to d68e42f Compare September 9, 2017 18:19
@Keno
Copy link
Member Author

Keno commented Sep 9, 2017

Rebased, removed error for unboxed values, renamed :gc_preserve to gc_preserve_begin and fixed the documentation

@Keno
Copy link
Member Author

Keno commented Sep 11, 2017

Any more comments?

@StefanKarpinski
Copy link
Member

Mergie, mergie?

@Keno Keno merged commit 08d2f70 into master Sep 15, 2017
@yuyichao yuyichao deleted the kf/fancygcpreserve branch September 15, 2017 15:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler:codegen Generation of LLVM IR and native code GC Garbage collector
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants