[TVM][RUST] Heap corruption when using FFI for BackendParallelLaunch
#1226
Comments
Can you elaborate a bit on this?
For parallel lambdas which use a temporary workspace, LLVM will emit
%70 = getelementptr inbounds float, float* %4, i64 %69
%71 = bitcast float* %70 to <8 x float>*
store <8 x float> %68, <8 x float>* %71, align 32, !tbaa !149
where
This sounds to me like a bug in the parallel code generator (or the workspace allocator), which does not respect the alignment requirements.
Specifically, we require the temporary workspace to be aligned to a minimum alignment as well, and the compiler takes advantage of that.
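To make the alignment requirement concrete, here is a minimal sketch of a workspace allocation in Rust that always hands back an over-aligned buffer. The names `Workspace`, `is_aligned`, and `TEMP_ALLOCA_ALIGNMENT` are hypothetical, and the value 64 is an assumption standing in for `kTempAllocaAlignment`; check the constant in your TVM version. This is not the fix that went into the Rust runtime, only an illustration of the idea.

```rust
use std::alloc::{alloc, dealloc, Layout};

/// Assumed minimum alignment for temporary workspaces (hypothetical value;
/// it should match kTempAllocaAlignment in the TVM headers you build against).
pub const TEMP_ALLOCA_ALIGNMENT: usize = 64;

/// Hypothetical owner of an over-aligned heap buffer.
pub struct Workspace {
    ptr: *mut u8,
    layout: Layout,
}

impl Workspace {
    /// Allocates `size` bytes aligned to TEMP_ALLOCA_ALIGNMENT.
    /// Returns None if the layout is invalid or the allocation fails.
    pub fn new(size: usize) -> Option<Workspace> {
        // `alloc` must not be called with a zero-sized layout, so round up.
        let layout = Layout::from_size_align(size.max(1), TEMP_ALLOCA_ALIGNMENT).ok()?;
        let ptr = unsafe { alloc(layout) };
        if ptr.is_null() {
            None
        } else {
            Some(Workspace { ptr, layout })
        }
    }

    /// Raw pointer that would be handed back to the generated code.
    pub fn as_ptr(&self) -> *mut u8 {
        self.ptr
    }
}

impl Drop for Workspace {
    fn drop(&mut self) {
        // Free with the same layout that was used for the allocation.
        unsafe { dealloc(self.ptr, self.layout) };
    }
}

/// Returns true if `ptr` is a multiple of `align`.
pub fn is_aligned(ptr: *const u8, align: usize) -> bool {
    (ptr as usize) % align == 0
}
```

A buffer obtained this way satisfies the `align 32` on the vector store quoted above, since any 64-byte-aligned address is also 32-byte aligned. A plain `Vec<u8>` gives no such guarantee.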
Yes, this was most definitely a bug in my implementation of the workspace allocator. It's fixed now, so threading works in the Rust runtime. Figuring this out required bisecting the generated LLVM code with debugging statements, which isn't the best developer experience, but I suppose this is unavoidable.
Gotcha, maybe we should put a prominent comment on the workspace and DeviceAPI interfaces to warn developers about this potential issue. This is certainly something that I would have overlooked as well.
/*!
* \brief Backend function to allocate temporal workspace.
*
* \note The result allocate spaced is ensured to be aligned to kTempAllocaAlignment.
*
* ...
*/
This is totally my fault.
When using an FFI for TVMBackendParallelLaunch, even heap-allocating a single byte corrupts the resulting computation. One possible cause is that there's some unintentional malloc/free happening when constructing the flambda closure.
Another (probably more likely) possibility is that I've set a struct field incorrectly somewhere. Parallel execution of basic TVM ops works, after all.
For reference, Rust uses jemalloc.
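For anyone checking the struct-field hypothesis, here is a hedged sketch of how the FFI types involved might be declared on the Rust side. The layout of `TVMParallelGroupEnv` and the shape of `FTVMParallelLambda` follow my reading of TVM's C backend header and should be verified against the header you build with. `parallel_launch_serial` is a hypothetical single-threaded stand-in, not the real `TVMBackendParallelLaunch`, and treating a non-positive `num_task` as one task is an assumption made for this sketch.

```rust
use std::os::raw::{c_int, c_void};

/// Mirrors the C struct passed to each parallel task.
/// #[repr(C)] is required so field order and padding match the C side.
#[repr(C)]
pub struct TVMParallelGroupEnv {
    pub sync_handle: *mut c_void,
    pub num_task: i32,
}

/// Signature of the lambda emitted by the code generator (assumed shape).
pub type FTVMParallelLambda = extern "C" fn(
    task_id: c_int,
    penv: *mut TVMParallelGroupEnv,
    cdata: *mut c_void,
) -> c_int;

/// Hypothetical serial launcher: runs every task on the calling thread
/// and stops at the first non-zero return code.
pub extern "C" fn parallel_launch_serial(
    flambda: FTVMParallelLambda,
    cdata: *mut c_void,
    num_task: c_int,
) -> c_int {
    // Assumption for this sketch: a non-positive count means one task.
    let n = num_task.max(1);
    let mut penv = TVMParallelGroupEnv {
        sync_handle: std::ptr::null_mut(),
        num_task: n,
    };
    for task_id in 0..n {
        let ret = flambda(task_id, &mut penv, cdata);
        if ret != 0 {
            return ret;
        }
    }
    0
}
```

Running the generated lambda through a serial launcher like this is one way to separate threading problems from layout or alignment problems: if the result is still corrupted with a single thread, the thread pool is not the culprit.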
Steps to reproduce