-
Notifications
You must be signed in to change notification settings - Fork 116
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Segfault in LuaTable::set #18
Comments
Valgrind stacktrace (take this with a grain of salt, it's optimized after all):
Instruction at fault:
I suspect that |
hmmmm, that's concerning, it's ONLY on the current nightly compiler? Do you have a reproduction I could look at? |
It's also at least on yesterday's nightly, but I neither have the bandwidth nor the disk space to bisect. The reproduction is just to run EDIT: I'm on x64 Linux, for reference |
The easiest way to cause this: let lua = Lua::new();
lua.globals().set("bla", LuaNil).unwrap(); It happens with any value, not just |
It doesn't seem to be failing anymore, but it might be because I have "fixed" #19. I say "fixed" because I'm half convinced nothing will ever be fixed again, and that the Lua API is designed to drive me insane. |
@kyren I can still reproduce this |
It's not failing on travis, maybe I was incorrectly assuming travis was x86_64 linux? Let me actually boot up a vm and try it. |
I completely can't reproduce this, all tests pass on my linux x86_64 vm on the latest nightly. I realize that's not super helpful. |
Odd. Meanwhile, I've also reproduced this on ARM (also Linux). My exact rustc version, for the record:
Also reproduces on Maybe it's a difference between C compilers that's exposing UB? I'm using |
I should be more clear, I said:
But what I should have said is I actually have not seen this fail at all so far. |
I have tried gcc version 6.3, but I can try 7.1.1. I've also tried whatever the latest apple clang is, I guess |
Actually can you just tell me what OS you're using, that might be faster than trying to set up a specific compiler version |
Oh, right. Arch Linux. Currently bisecting by the way, so no need to start that if you manage to reproduce. |
|
The following PRs were merged between those: rust-lang/rust#43180 |
Nice sleuthing! |
Wow, thanks for putting all the effort into this bug! The only difference is that I'm running a slightly different kernel now, I'll check out and see if that makes a difference although it shouldn't affect anything really. Just to make sure: You did run with I'll look into more ways to properly debug/reproduce this, then. |
AHA, thank you, I was missing that part. NOW I can reproduce it. I had completely forgotten that was not the default mode of test, because I don't usually run cargo directly, I have a Makefile that has a test target in it, and THAT sets --release. Sorry for all that! |
Ah, okay, so I'm not going insane yet. Good :) |
Forget everything I said before, it's reproducible without a VM, you just have to ACTUALLY run in release. PEBKAC |
Okay, for real this time, I think this is fixed by 44c99ea. I'm sure error_guard was subtly wrong somehow with regards to.. I guess maybe mutable aliasing? I'm not sure, because I abandoned the approach as too difficult to do correctly. |
So, really sorry about not seeing the bug at first, you were absolutely correct that there was a problem. Now that I have a bit of time to breathe, are we possibly back to the "only" problem left being that errors in __gc metamethods are unsound? Are all the rest of these fun soundness issues solved? I have no idea how to even approach dealing with errors in __gc metamethods, other than simply writing a LOT of protected versions of ffi calls. I don't think that's really a workable solution, it would be massively slow and awkward, as well as slow at runtime. SURELY there must be a better way than that? Is there some way possibly of forcing errors in __gc to simply always panic? That's precisely what would happen when called from the top level of Lua, but in any sort of callback, there's always going to be a protected lua call somewhere up the stack that has called setjmp that will handle it (which is precisely what we don't want). If I even had an idea of how to approach it I would feel much better, right now I really don't know how to proceed |
Great, that seems to fix it for me as well! Apart from what you mentioned, there's also #21, but it's unclear if my assumptions about setjmp/longjmp always being UB are correct.
Since there's no way except pcall to find out that this happens, no. Well, technically... |
Yeah, that was actually on my list of things to do was use the default allocator as the Lua allocator, and do exactly that, panic on failure to allocate. I should have also mentioned the longjmp from rust problem as one of the remaining issues. I thought about disabling gc during callbacks, but I had another thought while I was in the shower. It could be possible to much more cheaply wrap Lua function calls by using thread local storage to store the arguments and return values to protected versions of calls, and if that's possible I could maybe use macros to much more cheaply (in both runtime and programmer time) protected call wrap things? That, combined with stack_guard, could mean that I would just call a very large number of protected calls in most API entry points. MAYBE that would be okay? |
Possibly. I'm not sure how exactly that would look, though. Anyways, closing this issue since it was fixed! |
After a long debugging session, I've noticed that
cargo test --release
fails with a segfault on the current nightly.This might as well be a regression of the compiler, I'm not yet sure.
The text was updated successfully, but these errors were encountered: