Should we / can we make MaybeUninit<T> always preserve all bytes of T (including padding)? #518

Open
RalfJung opened this issue Jul 26, 2024 · 63 comments

Comments

@RalfJung
Member

It is a frequent source of confusion that MaybeUninit<T> is not just preserving all the underlying bytes of storage, but actually if T has padding then those bytes are lost on copies/moves of MaybeUninit<T>.

This is currently pretty much a necessary consequence of the promise that MaybeUninit<T> is ABI-compatible with T: some ABIs don't preserve the padding of T when it is passed to a function. However, this was not part of the intention with MaybeUninit at all, it is something we discovered later.

Maybe we should try to take this back, and make the guarantee only for types without padding?

I am not even sure why we made this a guarantee. We made the type repr(transparent) because for performance it is quite important that MaybeUninit<$int> becomes just an iN in LLVM. But that doesn't require a stable guarantee. And in fact it seems like it would almost always be a bug if the caller and callee disagree about whether the value has to be initialized. So I would be curious about real-world examples where this guarantee is needed.
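To make the padding in question concrete, here is a minimal sketch. `Pair` and the helper functions are hypothetical names introduced only for illustration; the struct is `repr(C)` so that its field offsets are fully specified, and the sketch assumes a typical target where `u16` has alignment 2.

```rust
use std::mem::{align_of, offset_of, size_of, MaybeUninit};

// Hypothetical example type; repr(C) pins down field order and offsets.
#[repr(C)]
#[allow(dead_code)]
struct Pair {
    a: u8,
    b: u16,
}

/// Number of bytes of `Pair` that belong to no field, i.e. padding.
fn pair_padding_bytes() -> usize {
    size_of::<Pair>() - size_of::<u8>() - size_of::<u16>()
}

/// Offset of the padding byte that sits between `a` and `b`.
fn pair_padding_offset() -> usize {
    offset_of!(Pair, a) + size_of::<u8>()
}

/// `MaybeUninit<Pair>` occupies the same storage as `Pair`, padding included.
fn wrapper_matches_layout() -> bool {
    size_of::<MaybeUninit<Pair>>() == size_of::<Pair>()
        && align_of::<MaybeUninit<Pair>>() == align_of::<Pair>()
}
```

The byte at `pair_padding_offset()` is the one whose contents are not guaranteed to survive a copy or move of `MaybeUninit<Pair>` under the current rules.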

@elichai

elichai commented Jul 26, 2024

I'll add that a typed copy of an uninitialized variable is UB in C, so there's no need to promise any ABI for FFI compatibility.
So this leaves us with the "Rust" ABI, which isn't stable anyway.

@RalfJung
Member Author

RalfJung commented Jul 26, 2024 via email

@Diggsey

This comment was marked as off-topic.

@RalfJung

This comment was marked as off-topic.

@carbotaniuman

The PR that introduced the guarantees does not talk about padding, and it seems like that wasn't really understood back then. The t-lang minutes discussing this are lost to time and reorganizations, but it seems doubtful that such a consideration was raised. Discussions from 2018 raise a lack of real-world use cases for ABI compatibility, and I agree with such a sentiment in the present.

I don't think this would be approved nowadays, but I am incredibly apprehensive about removing it. There are few places in the Rust documentation that use always for guarantees like this, and the use cases for some weird FFI thunks or bindings would be nigh-impossible to properly test with crater or similar...

@Diggsey

This comment was marked as off-topic.

@RalfJung
Member Author

RalfJung commented Jul 26, 2024

@carbotaniuman I think we should consider removing it. If we can't come up with any legitimate usecase, I think we should definitely remove it. I don't like going back on a promise like this, but if we don't have a usecase that could be broken by taking back this promise, then the chances that someone is affected should be very slim.

@Diggsey thanks for explaining why you think this belongs in this thread. But I disagree. "MaybeUninit preserves provenance" is not relevant here. You will note that provenance does not appear in the issue description. Furthermore, provenance on CHERI works like it does everywhere else, so even if provenance were relevant, CHERI wouldn't change anything. It is true that you can write code with MaybeUninit that will work everywhere but not on CHERI; discussing that is off-topic here as the reasons are completely different from what this thread is about and also different from what you mentioned -- it is caused not by padding and not by provenance, but by capabilities. So please take this elsewhere, e.g. Zulip or a new issue where you explain why you think MaybeUninit's provenance behavior is incompatible with CHERI, but I'd prefer not to see yet another thread derailed by CHERI. I like CHERI and want to see it work in Rust, but our main task here is to figure out Rust for the existing targets we already support. CHERI support is a nice extra that I'll happily discuss, as long as it doesn't distract from our core task.

@carbotaniuman

If we do agree to remove the guarantee, I expect it to break 0 uses in practice. My only other concern would be the performance impact of having to copy more bytes. It probably won't affect SIMD or buffers though, so I don't think it's really an issue.

@bjorn3
Member

bjorn3 commented Jul 26, 2024

I think we should preserve the memory layout compatibility, but drop the calling convention compatibility. That could be done using repr(C) instead of repr(transparent) I think.
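As a rough sketch of what "layout-compatible but not necessarily calling-convention-compatible" could look like, the following defines a hypothetical `repr(C)` union. `ReprCMaybeUninit` and `layout_matches` are illustrative names, not anything from std, and the sketch only checks size and alignment; it makes no claim about how such a union is passed by value.

```rust
use std::mem::{align_of, size_of, ManuallyDrop};

// Hypothetical stand-in for illustration only; not std's definition.
// repr(C) fixes the union's size and alignment to those of its largest /
// most-aligned field, which here is `T`.
#[repr(C)]
#[allow(dead_code)]
union ReprCMaybeUninit<T> {
    uninit: (),
    value: ManuallyDrop<T>,
}

/// Checks that the union occupies exactly the same storage as `T`.
fn layout_matches<T>() -> bool {
    size_of::<ReprCMaybeUninit<T>>() == size_of::<T>()
        && align_of::<ReprCMaybeUninit<T>>() == align_of::<T>()
}
```

For example, `layout_matches::<(u8, u16)>()` holds even though the tuple contains padding.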

@chorman0773
Contributor

FTR, I use MaybeUninit<T> in the signatures of lccc's libatomic and libsfp ABI-level routines. This is because they get called from xlang's codegen, and xlang allows uninit for those operations. Though in these cases, they are types without padding in the signature.

However, for compatibility with gcc/clang, they have to expose an ABI equal to the routines using primitives.

@chorman0773
Contributor

(And in general, I agree with @carbotaniuman - unless crater is testing all kinds of targets, I'm betting it primarily tests x86_64, where aggregate-of-one-field will get passed the same way as that one field*, so without using miri-crater, the ABI checks won't be found by crater. If the code is used on something like arm32 though, it's going to be very visibly broken)

@RalfJung
Member Author

Yes, this is super hard to test for. I wonder if it's worth having a blog post asking people whether they need this guarantee...

I think we should preserve the memory layout compatibility, but drop the calling convention compatibility. That could be done using repr(C) instead of repr(transparent) I think.

Yes, concretely the proposal would be:

  • T and MaybeUninit<T> always have the same size and alignment
  • they have the same ABI if T has no padding

Or maybe "no padding" should be restricted a bit further, like "if T is a primitive integer/float/pointer type" or so. Note that some non-power-of-2 SIMD types have padding so we have to be careful if we want to talk about those types.
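The part of the guarantee that would remain for scalar types can be sketched as follows. The function names are hypothetical; the call relies on the documented ABI compatibility of `MaybeUninit<u32>` and `u32`, and the argument is fully initialized, so the callee never sees uninitialized data.

```rust
use std::mem::{transmute, MaybeUninit};

// Hypothetical callee that expects a plain, initialized `u32`.
fn double(x: u32) -> u32 {
    x.wrapping_mul(2)
}

/// Calls `double` through a function pointer whose parameter type is
/// `MaybeUninit<u32>` instead of `u32`.
fn call_through_maybe_uninit(value: u32) -> u32 {
    let f: fn(u32) -> u32 = double;
    // SAFETY: `u32` has no padding, and `MaybeUninit<u32>` is documented to be
    // ABI-compatible with `u32`; the value passed below is initialized.
    let g: fn(MaybeUninit<u32>) -> u32 = unsafe { transmute(f) };
    g(MaybeUninit::new(value))
}
```

Under the proposal, the same trick with an aggregate that has padding, such as `(u8, u16)`, would no longer be covered.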

@Ddystopia

Is the only motivation for backing out of that promise the fact that this is a frequent source of confusion? What benefits besides clarity can Rust gain?

@RalfJung
Member Author

We never intended MaybeUninit<(u8, u16)> to have a padding byte that would be lost on copies. This is a complete accident. It's not just a source of confusion, it's not the semantics we want. It came up in #517 where it means that returning a MaybeUninit<T> from an atomic compare-exchange actually doesn't work since padding bytes still get lost so we won't end up having the same bit pattern as what is stored in the atomic location.

@jamesmunns
Member

jamesmunns commented Jul 26, 2024

I'm pretty sure this is still the case, but it might be worth it to enumerate things that ARE still allowed for this wrt FFI/ABI concerns. My primary use of MaybeUninit<T> in FFI is for "outptr" usages (edit: specifically &mut MaybeUninit<T> or *mut MaybeUninit<T>), which seems to be still good (because we never copy/pass by value - the part that is discussed by this issue), but for folks like myself might be worth spelling out clearly/contrasting what is no longer allowed.
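For reference, the out-pointer pattern described here might look like the sketch below. `Pair`, `fill_pair`, and `make_pair` are hypothetical; in real FFI the callee would be an external C function rather than one defined in Rust. Only a pointer crosses the call boundary, so the by-value ABI of `MaybeUninit<Pair>` never comes into play.

```rust
use std::mem::MaybeUninit;

// Hypothetical example type.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
struct Pair {
    a: u8,
    b: u16,
}

// Hypothetical callee using a C-style out-pointer.
unsafe extern "C" fn fill_pair(out: *mut MaybeUninit<Pair>) {
    // SAFETY: the caller guarantees `out` is valid for writes and aligned.
    unsafe { out.cast::<Pair>().write(Pair { a: 1, b: 2 }) }
}

/// Reserves uninitialized storage, lets the callee fill it, then reads it.
fn make_pair() -> Pair {
    let mut slot = MaybeUninit::<Pair>::uninit();
    // SAFETY: `slot` is valid for writes, and `fill_pair` initializes every
    // field before `assume_init` is called.
    unsafe {
        fill_pair(&mut slot);
        slot.assume_init()
    }
}
```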

@RalfJung
Member Author

RalfJung commented Jul 26, 2024

Yes, ABI compatibility is about the "by-value" part of a function argument or return type. That's how we've consistently been using this term for a while now, also see our glossary and the documentation on ABI compatibility.

In public communication we'll obviously spell out the details more than in internal discussion. ("Internal" not as in "private" but as in "among the team members and anyone else who's willing to participate".)

@chorman0773
Contributor

Or maybe "no padding" should be restricted a bit further, like "if T is a primitive integer/float/pointer type" or so. Note that some non-power-of-2 SIMD types have padding so we have to be careful if we want to talk about those types.

I at the very least need target simd types as well - for floating-point types that aren't directly supported by rust (e.g. f2x64_t), I wrap them in a target-specific ABI type, which on x86_64, is mostly __m128.

@RalfJung
Member Author

Since that is a compiler-internal concern, you could also do this by providing more ABI guarantees than what Rust provides in general.

But that case would be covered by "types without padding", or we could explicitly mention the stdarch SIMD types (since they are all powers of 2).

@chorman0773
Contributor

Since that is a compiler-internal concern, you could also do this by providing more ABI guarantees than what Rust provides in general.

Not fully - you don't necessarily need to compile the rtlibs with lccc themselves, they're written in mostly portable rust, and quite deliberately. I'd like to be able to continue providing that guarantee.

Yes, ABI compatibility is about the "by-value" part of a function argument or return type. That's how we've consistently been using this term for a while now, also see our glossary and the documentation on ABI compatibility.

You can also now see a formalization in reference#1545, as a note.

@RalfJung
Member Author

This has caused an actual soundness bug now: rust-lang/rust#134713.

I think we should seriously consider restricting the ABI compatibility guarantee to scalar and SIMD types.

@chorman0773
Contributor

The issue with that is that will make it hard to represent "Maybe Initialized" aggregate types in ABIs.
I already know I need this for fn() since I use this (via a MaybeValid[1] wrapper), in place of void(*)(int) in signal. Depending on some other design details of my OS (which I'm currently in the process of implementing a wine-like compatibility layer for running programs for it in Rust), I might need this behaviour (either via the MaybeValid wrapper or via MaybeUninit itself) for at least one aggregate type (at the very least, if I want to keep the strong typing my OS's API has been exporting).

Footnotes

  1. MaybeValid is a repr(transparent) wrapper around MaybeUninit<T> that disallows uninit bytes wherever T does as a safety invariant. In particular, the type is allowed to contain "Any Initialized Bit pattern" or any safe value of T, minus padding bytes of T.

@RalfJung
Member Author

Yeah I don't think a by-value ABI-compatible "maybe init" aggregate is common enough to justify the constant stream of surprises and UB that this problem causes. I would suggest not designing your OS around such a facility.

@RalfJung RalfJung changed the title Should we / can we make MaybeUninit<T> always preserve all bytes of T? Should we / can we make MaybeUninit<T> always preserve all bytes of T (including padding)? Feb 23, 2025
@carbotaniuman

I had a longer post here that I since felt was too confrontational, but my thoughts have changed and I do not believe that solely changing this is justifiable given that this is not a breaking change for soundness, but merely to make the use of the API better for users. Unsafe Rust already has multiple sharp edges (SB/TB, Box noalias, provenance), and I feel like this is not a particularly sharp one once users understand it.

I would also like to echo the alternative of a new type BikeshedMaybeUninitBagOfBits, or if desire is expressed to retake the better name for the more useful use-case, BikeshedMaybeUninitIgnoringPadding. This to me feels similar to the exposed provenance methods in that we may not like them, but they've ossified (in this case for nearly 6 years), so providing both a good option with the obvious semantics as well as a way to express the use-cases others may care about with regards to ABI is a good compromise. We could even hang this on an edition!

@ia0

ia0 commented Feb 23, 2025

I might be missing something obvious, but isn't BikeshedMaybeUninitBagOfBits<T> the same as [MaybeUninit<u8>; size_of::<T>()]? This has limitations today because of generic_const_exprs (in particular you can't fix the stdlib soundness bug with this yet), but ultimately it seems to me the current definition of MaybeUninit seems the most expressive one (making it irrelevant to also have BikeshedMaybeUninitBagOfBits).
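For a concrete (non-generic) type, the byte-array formulation can already be written today. The sketch below uses hypothetical names (`Pair`, `PairBytes`, `PADDING_OFFSET`) and assumes a target where `u16` has alignment 2, so that offset 1 is the padding byte. Because `u8` has no padding, copies of the array carry every byte along. One difference from `MaybeUninit<Pair>` is that the array only has alignment 1.

```rust
use std::mem::{size_of, MaybeUninit};

// Hypothetical example type with one padding byte between `a` and `b`.
#[repr(C)]
#[allow(dead_code)]
struct Pair {
    a: u8,
    b: u16,
}

// Byte-wise storage for a `Pair`. Note: alignment 1, not align_of::<Pair>().
type PairBytes = [MaybeUninit<u8>; size_of::<Pair>()];

// Assumed offset of the padding byte in `Pair` on typical targets.
const PADDING_OFFSET: usize = 1;

/// The byte array is exactly as large as the type it stands in for.
fn storage_size_matches() -> bool {
    size_of::<PairBytes>() == size_of::<Pair>()
}

/// Fills the storage with `marker`, copies it twice by value, and reads back
/// the byte that would be padding in `Pair`.
fn roundtrip_padding_byte(marker: u8) -> u8 {
    let original: PairBytes = [MaybeUninit::new(marker); size_of::<Pair>()];
    let copied = original; // typed copy of an array of MaybeUninit<u8>
    let copied_again = copied;
    // SAFETY: every byte was initialized with `marker` above.
    unsafe { copied_again[PADDING_OFFSET].assume_init() }
}
```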

@saethlin
Member

I would also like to echo the alternative of a new type BikeshedMaybeUninitBagOfBits

What properties does this type have?

@carbotaniuman

BikeshedMaybeUninitBagOfBits is the hypothetical MaybeUninit that preserves all bytes of T like is being proposed in this issue. This was called bag of bits semantics in the past which is why the bikeshed is named that.

@chorman0773
Contributor

chorman0773 commented Feb 23, 2025

I would suggest not designing your OS around such a facility.

The issue is that the design isn't simply "Whether or not to use MaybeUninit in signatures". The issue that comes up is "What do I have to do on the rust side to match this C API while being compatible with other design decisions of the OS". Much of the SCI (System Call Interface) surface for the OS takes two-pointer aggregates by value instead of by pointer. In the past, when signals were a native part of the OS (and not emulated atop uSEH), sigset_t was defined as a struct wrapping a [u64; 2]. I'd rather those decisions not make it impossible to define other APIs in Rust.
Berkeley sockets will also be fun given that SOCKET likely needs to be a struct of a handle + metadata. And I may need to handle uninitialized SOCKETs - or at the very least, MaybeValid<SOCKET>s.

@RalfJung
Member Author

RalfJung commented Feb 27, 2025

In general, I already use this to represent the signature of signal, with a MaybeValid<extern "C" fn(i32)> wrapper.
Also, signal is pretty much where this comes into play, since the actual "signature" of a sigaction is extern "C" fn(i32, siginfo_t*, ucontext_t*), but to support exception->signal translation properly (in particular, allowing the program to call raise from a signal handler, when recursing into the exception handling entrypoint in userspace causes a program abort), I believe I actually need to pass the two parameters by value to a trampoline via the kernel (and thus cannot use something like Option) and then pass pointers. However, this won't always be fully initialized when SA_SIGINFO is not used - in particular, initializing the ucontext is an expensive operation, since it requires moving a lot of data from a kernelspace buffer to a userspace one.

Sorry, I don't follow. Please strip all the detail that's unnecessary for this thread and focus on the actual question: passing some uninitialized data across the ABI. You seem to be saying you want the ucontext_t to be uninit. However, there is no ucontext_t argument in the signature you showed. There is a ucontext_t* argument. Do you want the pointer itself to be uninitialized? That seems entirely unnecessary? Just make it a null pointer, or whatever. Also, (thin) raw pointers are scalar types, so if you really want to do this, MaybeUninit<*const ucontext_t> and *const ucontext_t are actually still ABI-compatible under my proposed changed spec. I am also fairly sure that leaving that pointer uninitialized is UB in C.

If it's just about leaving the data the pointer points to uninitialized, then that's entirely off-topic for this discussion as the ABI does not care about that. *const T and *const U for arbitrary sized types T, U are already declared ABI-compatible and I am not suggesting changing that.

Where is the documented guarantee that MaybeUninit preserves padding bytes?

We don't document this. But people still think it is true, I've myself thought it is true, and it is quite reasonable to think it is true. It's also a much more useful guarantee than the ABI compatibility one. We're still looking for even a single use-case that needs the ABI-compatibility for non-scalar types, whereas we already had a standard library correctness bug due to the lack of padding byte preservation.

@chorman0773
Contributor

So the signature I need to actually call looks more like extern "C" fn(i32, siginfo_t, ucontext_t) (note the lack of pointers). There would then be a rust trampoline that gets "called" from the exception handler, which takes the address of the latter two parameters before calling the signature. The case where ucontext_t will be uninitialized though, the C function that gets called won't actually have the two parameters.

@RalfJung
Member Author

RalfJung commented Feb 27, 2025

The case where ucontext_t will be uninitialized though, the C function that gets called won't actually have the two parameters.

Uh, then, I'd suggest you call it with a Rust function pointer / declaration that has the matching number of parameters? It's anyway UB if caller and callee disagree on the number of parameters.

@chorman0773
Contributor

It's in the same vein as calling main from __libc_start_main - actually an impossible task under strict ABI compatibility because you have to match one of 4 different signatures without knowing which signature it is. There are a lot of very low-level shenanigans that abuse ABI details like this.

@RalfJung
Member Author

RalfJung commented Feb 27, 2025

If you are anyway already causing technical UB by having the signature not match, you shouldn't be worried about venturing into "unspecified" land here either. Just have a version of ucontext_t where every scalar field is wrapped in MaybeUninit; that can still be fully uninitialized and you don't need to rely on MaybeUninit<Aggregate> doing anything specific with the ABI. We don't currently guarantee that to be ABI-compatible (we don't have a structural congruence rule for ABI compatibility of repr(C) structs), but you're already breaking our ABI compatibility rules so that's "fine".

@workingjubilee
Member

Did these standard library implementations surface this bug before or after you recommended changing the implementations of these things to use MaybeUninit?

@RalfJung
Member Author

RalfJung commented Feb 27, 2025

That sounds like you are trying to insinuate something -- I don't know exactly what you are referring to, and I don't know what I recommended when, but using T instead of MaybeUninit<T> would not have helped.

Let's not discuss the history of rust-lang/rust#134713 here please; there's already a separate issue for that. The fact is that that implementation made it in (neither authored nor reviewed by me), which is a pretty clear sign that "bag of bytes" semantics are (a) useful, and (b) easy to assume to already be the current semantics even if they are not.

@chorman0773
Contributor

chorman0773 commented Feb 27, 2025

If you are anyway already causing technical UB by having the signature not match, you shouldn't be worried about venturing into "unspecified" land here either

For the record here - the call that I'm worried about is going to be to the Rust trampoline - which does always have the same parameters. Then I call the actual sa_sighandler or sa_sigaction from that trampoline with appropriate pointers.

But in the case of handler, the ABI level is very well-defined (even if the Rust Level isn't). Whereas wrapping MaybeUninit<ucontext_t> and MaybeUninit<siginfo_t> is not quite as well-defined.

@chorman0773
Contributor

Also I speak of a position of being involved in the discussion here - a third party having absolutely zero idea this is happening may have just as much cause to rely on a stable language guarantee. And given much of this code just won't even run in miri (miri won't even run winter-lily, which is my current project touching Lilium), this is probably in the realm of "Breaks silently, until it doesn't".

@Lokathor
Contributor

Changing how pointers to MaybeUninit work does not appear to be in the proposal, and MaybeUninit integers would also specifically not be affected, so I guess you're concerned that people are passing MaybeUninit<SomeStruct> by value to/from an extern "C" function? Do I have that right?

@RalfJung
Member Author

RalfJung commented Feb 28, 2025

@chorman0773

Yes I am proposing to take back a documented language guarantee, and replace it with a different, more useful guarantee. I think long-term this will cause less harm. So I was asking if there's any cases where the ABI guarantee is useful or even needed.

I am still extremely confused about your example. You keep bringing up more and more concepts you're not explaining, and there are too many parties calling each other, so it's not even clear which call you are talking about when. What I think I understand is that there's a particular function call where the caller uses signature extern "C" fn(i32, siginfo_t, ucontext_t) (but you never even stated that signature! you stated a different one, and then later said that's not the real signature). But then based on some ambient information the caller might know that the callee is actually declared as extern "C" fn(i32, siginfo_t) and therefore you want to leave the last argument uninitialized. In other cases all 3 arguments exist and you want to actually pass the data, and then the ABI must of course match. But there's also a "Rust trampoline" involved somehow and now you completely lost me again.
There should be a single function call that matters, where caller and callee use different but ABI-compatible types. Please explain everything about that call, and leave away everything else.

Could you achieve your goals without the ABI guarantee (and without worsening performance)? And if yes, would that solution be any less "natural" than what you are currently doing? Frankly, based on what you said so far, any possible alternative seems more natural to me. ;)

@Lokathor
Contributor

(sorry for any confusion Ralf, but my own question was directed at chorman)

@chorman0773
Contributor

chorman0773 commented Feb 28, 2025

Ok, I'll restate what the flow is:

  • The Lilium Kernel has a core concept called an exception handler. This is invoked by the kernel (in much the same manner that a POSIX signal handler is invoked - asynchronously or synchronously, and assume same sort of ordering constraints on an asynchronous exception)
  • For certain exception types, the userspace runtime invokes a C signal handler. In order to do this, it first invokes a trampoline function. This trampoline always has signature extern "C" fn(i32, MaybeUninit<siginfo_t>, MaybeUninit<ucontext_t>). The exception handler knows when to initialize siginfo_t and ucontext_t but always passes them. This is because ucontext_t in particular is partially initialized with a handle needed to resume handling the previous exception.
    • The trampoline is invoked first because the exception handler is actually "Returning" to it. It needs to resume from the Exception Handling state so that the signal handler can call raise (if it was asynchronous), which in turn is translated to the synchronous entry point to exceptions, being ExceptionHandleSynchronous (calling this on a thread that is currently handling an exception causes the thread to exit with an unmanaged exception)
    • The Trampoline "call" is setup manually, so this relies on knowing (and matching) the ABI for MaybeUninit<siginfo_t> and MaybeUninit<ucontext_t>. This in turn requires an ABI guarantee for those types.
  • The trampoline then knows whether or not to call sa_sigaction (with a signature of unsafe extern "C" fn (i32, *mut siginfo_t, *mut ucontext_t)) or sa_sighandler (with a signature of unsafe extern "C" fn(i32)) (the trampoline also has a bunch of other setup to do to fully support the full set of sigaction options from POSIX). For efficiency reasons, the call is just merged into one without a branch (though this can easily be done from asm).

ucontext_t and siginfo_t are defined in another library that's also used by consumers of the API. The types there probably don't want to let the callee deinit things (especially in ucontext_t which is then read back by the trampoline, before the trampoline goes back into the Exceptionhandling context to resume handling the exception through other means).

@workingjubilee
Member

@RalfJung You are the Pope of Rust, or at least the Pope of Rust Safety Models. Whenever you say something, even if you say something blatantly wrong, almost no one calls you out on it. You are the proxy author and reviewer of all std code because everyone is reading everything you are writing and thinking about it when writing unsafe code. You are repeatedly cited in these discussions. If you repeatedly say something wrong, you can convince other people it's true, simply by repeating the wrong thing.

@RalfJung
Member Author

RalfJung commented Feb 28, 2025 via email

@workingjubilee
Member

workingjubilee commented Feb 28, 2025

That is a reasonable stance, honestly.

I just am not surprised incorrect code is written based on something you say again and again, and don't think it should be taken as evidence the confusion is that widespread if it might instead be your confusion spreading widely.

@RalfJung
Member Author

RalfJung commented Feb 28, 2025 via email

@comex

comex commented Feb 28, 2025

@chorman0773

Ok, I'll restate what the flow is:

Okay. It sounds to me like even if MaybeUninit's ABI compatibility guarantee is restricted to primitive types and its ABI for other types is unspecified, you have some options:

  1. Define MaybeUninitSiginfo and MaybeUninitUcontext structs that are like siginfo_t and ucontext_t but wrap the individual fields in MaybeUninit.

    This would still be guaranteed ABI-compatible with siginfo_t/ucontext_t by-value parameters in C.

    Maybe this is an abstraction violation, but it sounds like it shouldn’t be the end of the world since this is such a special case.

  2. Change the trampoline to take siginfo_t and ucontext_t by pointer, i.e. extern "C" fn(i32, *const siginfo_t, *const ucontext_t).

    After all, it's not like this is some syscall interface where you can't pass parameters via the stack. siginfo_t and ucontext_t are both large structs that will be passed on the stack anyway. Is the issue that the trampoline itself is a stable ABI boundary you don't want to break (separately from the stable POSIX API)? If so, then I think you've already made a mistake. ucontext_t may need to grow over time to account for future architecture extensions, so it should never be passed by value across a stable ABI boundary.
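Option 1 above can be sketched with a small stand-in struct. `Info` is a hypothetical placeholder (the real siginfo_t and ucontext_t are much larger and target-specific), and `MaybeUninitInfo` wraps each scalar field individually. The sketch only checks memory layout; whether the two structs are also ABI-compatible when passed by value is exactly the point under discussion in this thread and is not something this code establishes.

```rust
use std::mem::{align_of, offset_of, size_of, MaybeUninit};

// Hypothetical stand-in for a C struct such as siginfo_t.
#[repr(C)]
#[allow(dead_code)]
struct Info {
    code: i32,
    flag: u8,
    addr: usize,
}

// Same fields, each scalar individually wrapped in MaybeUninit.
#[repr(C)]
#[allow(dead_code)]
struct MaybeUninitInfo {
    code: MaybeUninit<i32>,
    flag: MaybeUninit<u8>,
    addr: MaybeUninit<usize>,
}

/// A value in which no field has been initialized.
fn fully_uninit() -> MaybeUninitInfo {
    MaybeUninitInfo {
        code: MaybeUninit::uninit(),
        flag: MaybeUninit::uninit(),
        addr: MaybeUninit::uninit(),
    }
}

/// Checks that both structs have identical size, alignment, and offsets.
fn layouts_agree() -> bool {
    size_of::<Info>() == size_of::<MaybeUninitInfo>()
        && align_of::<Info>() == align_of::<MaybeUninitInfo>()
        && offset_of!(Info, code) == offset_of!(MaybeUninitInfo, code)
        && offset_of!(Info, flag) == offset_of!(MaybeUninitInfo, flag)
        && offset_of!(Info, addr) == offset_of!(MaybeUninitInfo, addr)
}
```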

However, I think we can do better. Even if MaybeUninit<T> is not guaranteed to be ABI-compatible with T, it definitely should be guaranteed to have a stable ABI (when passed to extern "C" functions). It shouldn't be something that can change on rustc upgrades.

That should be enough of a guarantee that you could keep doing what you're currently doing. Since you're setting up the call manually, all that matters is that there is some stable ABI, not what it is exactly. The only issue would be if (a) you are worried about breaking an existing stable ABI boundary (for your apparently still-in-development OS), and (b) the stable ABI rustc adopts for MaybeUninit<siginfo_t> or MaybeUninit<ucontext_t> doesn't match what rustc does today. But in practice it is going to match, because the structs were already being passed on the stack.

Beyond that, if MaybeUninit has a stable ABI, then we may as well define it as being ABI-compatible with something. For example, if T is a struct, MaybeUninit<T> could be documented as ABI-compatible with an equivalent struct where u8 fields are added to fill all padding bytes. (This would need to be fleshed out a bit more to deal with nested structs/unions/whatever, but the basic idea should work.) This would ensure that functions taking MaybeUninit by-value can still be called from C or whatever.
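The "fill the padding with u8 fields" idea can be illustrated for a simple case. `Pair` and `PairPaddingFilled` are hypothetical types, and the sketch assumes a target where `u16` has alignment 2; it checks layout only and does not define any ABI rule.

```rust
use std::mem::{align_of, offset_of, size_of};

// Hypothetical type with one padding byte between `a` and `b`.
#[repr(C)]
#[allow(dead_code)]
struct Pair {
    a: u8,
    b: u16,
}

// The same layout, with the padding byte turned into an explicit field.
#[repr(C)]
#[allow(dead_code)]
struct PairPaddingFilled {
    a: u8,
    _pad: u8,
    b: u16,
}

/// Both structs have the same size, alignment, and field offsets.
fn filled_layout_matches() -> bool {
    size_of::<Pair>() == size_of::<PairPaddingFilled>()
        && align_of::<Pair>() == align_of::<PairPaddingFilled>()
        && offset_of!(Pair, a) == offset_of!(PairPaddingFilled, a)
        && offset_of!(Pair, b) == offset_of!(PairPaddingFilled, b)
}

/// Every byte of the filled struct belongs to some field.
fn filled_has_no_padding() -> bool {
    size_of::<PairPaddingFilled>() == 2 * size_of::<u8>() + size_of::<u16>()
}
```

Since `PairPaddingFilled` has no padding, every byte of it is part of a field and is therefore carried along when the value is passed or copied.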

…With all that said, I am definitely sympathetic to the alternative view that MaybeUninit<T> ought to remain ABI-compatible with T. It's definitely surprising if they're incompatible. But I also don't want Rust to have to add yet another wrapper type.

@carbotaniuman

carbotaniuman commented Feb 28, 2025

I think this breaking change is not justified by the stated goal, which is expressly not any inherent unsoundness, but just a desire to reduce a paper cut for unsafe Rust users. I think that's a great goal (and to be clear, I support it myself), but unsafe code is tricky, with many corner cases. Box noalias remains on the books today despite being a far larger footgun. We have not yet ruled out Stacked Borrows as an aliasing model! MaybeUninit not keeping padding may be surprising, but I think it can also be explained relatively easily and is a small footgun compared to other complexities.

In addition, this creates a special case in the ABI compatibility rules. Windows, until a few months ago, passed MaybeUninit by value across FFI boundaries. Now, I'm 95% sure that all of those types are simple pointers, but we've now added a footgun in cases where they aren't. And any cases which relied on this ABI compatibility will likely break in weird and spectacular ways only at runtime - these are the types of cases likely to be underindexed on Crater.

Making MaybeUninit<T> preserve the padding bytes of T also comes with a performance downside - previously these padding bytes were garbage, and the compiler could not pass it across function boundaries (especially in extern "Rust", where this guarantee would definitely apply). Making the padding worthwhile would likely regress common use cases of storing or transferring a maybe uninitialized T in favor of use cases that do need this bag of bits functionality.

I think it's also disingenuous to act like the use cases here are contrived and useless - this is a documented language feature, not RUSTC_BOOTSTRAP. And yes, while niche use cases are indeed niche and some of us here may consider the stabilization to be a mistake, we (speaking as a community) did make the mistake.

I am also confused as to why my compatibility ideas seemed to have been (intentionally?) ignored. MaybeUninit is a vocabulary type today - you cannot write the semantics of it in library code. #[repr(transparent)] on unions has real problems with the design, and may not be stabilized in a "reasonable" amount of time. Migration to something else for the use cases that need it will be all but impossible, or users will just take the hit and write T instead, willfully invoking undefined behavior due to the lack of an alternative.

To me, MaybeUninitIgnoringPadding and MaybeUninitBagOfBits are different types with different use cases. What they may be named or how those capabilities are actually written are immaterial to me, what is important is that we do not throw away a core guaranteed capability without sufficient migration patterns and justification.

@comex

comex commented Mar 2, 2025

Making MaybeUninit<T> preserve the padding bytes of T also comes with a performance downside - previously these padding bytes were garbage, and the compiler could not pass it across function boundaries (especially in extern "Rust", where this guarantee would definitely apply). Making the padding worthwhile would likely regress common use cases of storing or transferring a maybe uninitialized T in favor of use cases that do need this bag of bits functionality.

I think you will be hard-pressed to find a case where this makes a measurable difference, even if you specifically microbenchmark the function call.

I'm neutral regarding the rest of your post.

@RalfJung
Member Author

I think it's also disingenuous to act like the use cases here are contrived and useless - this is a documented language feature, not RUSTC_BOOTSTRAP. And yes, while niche use cases are indeed niche and some of us here may consider the stabilization to be a mistake, we (speaking as a community) did make the mistake.

It took us like 2 weeks to even get to the bottom of the one use case that was brought up. And then it turns out it's not actually a use case for the ABI guarantee that is documented, but for a weaker guarantee that is entirely compatible with MaybeUninitBagOfBits. So I think "contrived" is a pretty accurate description. "useless" was not used in this discussion; please don't put words in other people's mouths.

To me, MaybeUninitIgnoringPadding and MaybeUninitBagOfBits are different types with different use cases.

We haven't yet seen a single use case for MaybeUninitIgnoringPadding. @chorman0773, it turns out, doesn't need MaybeUninitIgnoringPadding, they just need some stable ABI for MaybeUninitUcontext and MaybeUninitSiginfo.

All evidence points towards MaybeUninitIgnoringPadding being a type that nobody needs. I will also point out that nobody wanted to add that type, it was created by accident.

@carbotaniuman

All evidence points towards MaybeUninitIgnoringPadding being a type that nobody needs. I will also point out that nobody wanted to add that type, it was created by accident.

I think this is an example of overindexing on the responses present. The vast majority of use cases for this will be low-level, or a workaround for some legacy code, or maybe to provide some potentially uninitialized data to an assembly function for math or similar. You can probably find issues (undocumented guarantees, weird code, not technically supported) with any or all of these potential use cases. I might even agree with those issues.

I personally do have code running that uses this ABI guarantee, but I don't particularly care about how this issue resolves wrt that code. If the capability goes away, I will just remove the MaybeUninit and go on with my life. Nor do I really have a desire to litigate the "validity" or non-contrivedness of how my code is written.

As I have said, I am not saying that we should freeze MaybeUninit<T> in amber because of the past, only that the capabilities not be lost.

But maybe it is decided that these capabilities are not worth keeping around. The obvious next step will be a crater run. Much of this code will not be present in crater. It may be private, internal, or using FFI, such that crater cannot really test it. I expect there to be ~0 breakage on said run. Such a number will not accurately reflect the breakage. And as the capabilities are taken away, there will be no way to migrate without willfully invoking UB.

Again, maybe that much breakage will be tolerated. We broke an inordinate amount of the ecosystem in the time debacle. This would be significantly smaller and far less impactful. But from my reading of your comments, I think you seem to be thinking that there will be ~0 actual breakages from this change. To that, I completely disagree.

@CAD97

CAD97 commented Mar 12, 2025

No fundamental capabilities are lost, as you can use a compound type of MaybeUninit instead of a MaybeUninit of compound type to express the same exact ABI as is "lost" by this change. It's perhaps not as compositional, but the ability to write the ABI is still present.
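As a rough illustration of "a compound type of MaybeUninit instead of a MaybeUninit of compound type", the sketch below declares a hypothetical `Pair` struct and a field-wise counterpart `PairUninit` (both names are assumptions, not from any real API) and checks that the two have the same size and alignment under `repr(C)`.

```rust
use std::mem::{align_of, size_of, MaybeUninit};
use std::os::raw::{c_char, c_ulonglong};

// Hypothetical fully initialized struct, as a C header might declare it.
#[repr(C)]
#[allow(dead_code)]
struct Pair {
    foo: c_char,
    bar: c_ulonglong,
}

// The same fields, each individually allowed to be uninitialized.
#[repr(C)]
#[allow(dead_code)]
struct PairUninit {
    foo: MaybeUninit<c_char>,
    bar: MaybeUninit<c_ulonglong>,
}

/// True if both structs have identical size and alignment.
fn same_layout() -> bool {
    size_of::<Pair>() == size_of::<PairUninit>()
        && align_of::<Pair>() == align_of::<PairUninit>()
}

fn main() {
    assert!(same_layout());
    println!("Pair and PairUninit agree on size and alignment");
}
```

Each `MaybeUninit` here wraps a scalar, which is the case whose ABI guarantee is not in question in this thread.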

And note that for any repr(C) type which is pass-in-memory, likely nothing even changes! To have a case which is broken, you need a case which doesn't preserve padding over some stable ABI, or to be a Rust-Rust call, which is immediate UB if there are any uninitialized bytes in an initialized receiving argument type anyway.

If the capability goes away, I will just remove the MaybeUninit and go on with my life

Even if you don't want to debate the "validity" of whatever hacks you needed to use, it would still be good to see an example of a case where you really do want the ABI for MaybeUninit of a non-scalar to match that of the initialized type. Even if it's just “I have a cursed legacy C API where a correct call provides an uninitialized struct lvalue as a parameter,” that's better than just saying “code could be relying on this.”

I really am sympathetic to being perfectly strict about avoiding language breaking changes. But this really does feel like a case of accidental stabilization that just makes things worse for everyone. Casting fn(MaybeUninit<T>) -> U to fn(T) -> MaybeUninit<U> isn't useful enough to justify nonscalar ABI equivalency nor a second flavor of MaybeUninit.
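For reference, this is roughly what such a cast looks like in the scalar case, which would stay guaranteed under the proposal. It is a minimal sketch; `ignore_input` and `call_flipped` are hypothetical names, and the soundness of the call rests on the documented guarantee that `MaybeUninit<u32>` is ABI-compatible with `u32`.

```rust
use std::mem::{transmute, MaybeUninit};

// Hypothetical function that tolerates an uninitialized argument by ignoring it.
fn ignore_input(_x: MaybeUninit<u32>) -> u32 {
    42
}

/// Calls `ignore_input` through a function pointer with the "flipped" signature.
fn call_flipped(arg: u32) -> u32 {
    let original: fn(MaybeUninit<u32>) -> u32 = ignore_input;
    // Assumption: relies on `MaybeUninit<u32>` and `u32` being ABI-compatible.
    let flipped: fn(u32) -> MaybeUninit<u32> = unsafe { transmute(original) };
    // The callee always returns an initialized value, so this read is fine.
    unsafe { flipped(arg).assume_init() }
}

fn main() {
    println!("{}", call_flipped(7));
}
```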

@RalfJung
Member Author

RalfJung commented Mar 12, 2025

@carbotaniuman you are basically saying "trust me I have a usecase but I'm not interested in telling you about it". That's not a constructive contribution to this discussion.

As you said, a crater run is not very useful. The next step is an RFC to get wider awareness of this proposal and make it more likely that if there is some usecase relying on this accidental guarantee out there, we will hear about it.

@carbotaniuman

Given your behavior to the other use case presented, I do not believe you are asking this in good faith. My main use cases are effectively what CAD97 described, where I have C (and assembly) code that returns various complex structs that are potentially uninitialized. I would justify this with ASM freeze, but that's verboten, so I am being technically correct by using MaybeUninit<T> here and operating solely on that.

@CAD97

CAD97 commented Mar 13, 2025

Could it make sense to specify MaybeUninit's ABI a bit more specifically than just for scalars? Specifically, to guarantee that:

  • for MaybeUninit of a scalar, MaybeUninit has the ABI of that scalar; and
  • for MaybeUninit of a type guaranteed to have an indirect (i.e. in memory, whether at a known stack position or behind a pointer) ABI, MaybeUninit has that same indirect ABI.

Some version of this, if practical to specify and provide, would serve the needs of the existing code using MaybeUninit's ABI, as it keeps the ABI matching in the scenarios that don't cause issues, only changing it for types shaped like #[repr(Rust)] (scalar, scalar) (ScalarPair ABI).


Aside: C code which converts an uninitialized lvalue into an rvalue has undefined behavior by the standard for any type which is not excluded from having trap values (which is essentially LLVM's undef), i.e. is not unsigned char. But with the exception of LTO, Rust does FFI across the OS object file semantics (i.e. machine code semantics), not the C language semantics.

@RalfJung
Member Author

RalfJung commented Mar 21, 2025

Given your behavior to the other use case presented, I do not believe you are asking this in good faith.

I am sorry you feel that way. I don't know what I could have done differently to avoid this. Maybe I could have been a bit more patient in how I extracted the details of the use case; I admit I was frustrated since the first explanations we were given were just not useful. But ultimately, if getting to the bottom of a technical question is considered acting in bad faith, we may as well stop having technical discussions altogether. I won't just take it on faith that someone has a use case that they are unwilling or unable to properly describe.

And it turned out I was right in getting to the bottom of this, since "T and MaybeUninit<T> are ABI compatible" turned out to be entirely irrelevant for that use case, as I suspected. What really matters is "MaybeUninit<T> has some well-defined ABI".

The proposed migration plan for cases like that is to move the MaybeUninit down to the fields, so that it only ever wraps scalars. I hope that would also cover your use case. (And as CAD says, it is nearly impossible to return potentially uninitialized structs from C without causing UB. clang in fact adds noundef to all arguments and return values, even character types.)
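A sketch of what that migration could look like at a call site, with a Rust-defined `extern "C"` function standing in for the foreign code. `Sample`, `fill_total`, and `doubled` are hypothetical names used only for illustration.

```rust
use std::mem::MaybeUninit;

// Hypothetical struct with MaybeUninit pushed down to each scalar field.
#[repr(C)]
#[derive(Clone, Copy)]
#[allow(dead_code)]
struct Sample {
    id: MaybeUninit<u8>,
    total: MaybeUninit<u64>,
}

// Stand-in for a foreign function: it initializes `total` and leaves `id` alone.
extern "C" fn fill_total(mut s: Sample, n: u64) -> Sample {
    s.total = MaybeUninit::new(n * 2);
    s
}

/// Passes a fully uninitialized `Sample` by value and reads back only `total`.
fn doubled(n: u64) -> u64 {
    let input = Sample {
        id: MaybeUninit::uninit(),
        total: MaybeUninit::uninit(),
    };
    let out = fill_total(input, n);
    // Only `total` was written by the callee, so only `total` is read.
    unsafe { out.total.assume_init() }
}

fn main() {
    println!("{}", doubled(21));
}
```

The cost is that the field-wise struct has to be written and kept in sync by hand for every affected type.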

Could it make sense to specify MaybeUninit's ABI a bit more specifically than just for scalars? Specifically, to guarantee that:

This is getting very close to the edge of my knowledge of ABI details, so I feel uncomfortable making definite statements here. Deciding ABI things without knowing enough about ABI is what got us into this situation in the first place.

In particular, there's somewhat of a layering violation here: the Rust compiler and language, and the docs, don't really have a concept of which types would have an indirect ABI. So there's no proper way we could even set up that definition in the current framework -- we'd have to make that framework a lot more complicated first.

@carbotaniuman

carbotaniuman commented Mar 21, 2025

since "T and MaybeUninit<T> are ABI compatible" turned out to be entirely irrelevant for that use case

I think this is misleading, and being able to spell out the struct as a bunch of MaybeUninit scalar fields to match the underlying struct is useful as a migration, but to call this an alternative is the peak of malicious compliance :/. There's also no guarantee that the structs will be laid out in the way I expect. Fundamentally, on one side, I have a function fn(Args) -> Foo, (where uninitness is not a factor because we are on the ABI level), and on the Rust side I would like to call it like fn(Args) -> MaybeUninit<Foo> (because uninitness does matter in Rust).
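The pattern described above can be sketched as follows. The foreign function is replaced by a Rust-defined stand-in so the example runs on its own; `Foo`, `make_foo`, and `call_as_maybe_uninit` are hypothetical names. The transmute relies on the currently documented guarantee that `MaybeUninit<Foo>` has the same ABI as `Foo`, which is exactly the guarantee this issue proposes to narrow.

```rust
use std::mem::{transmute, MaybeUninit};

// Hypothetical struct returned by value from the foreign side.
#[repr(C)]
#[derive(Clone, Copy)]
#[allow(dead_code)]
struct Foo {
    flag: u8,
    count: u64,
}

// Stand-in for the foreign function, declared as returning a plain `Foo`.
extern "C" fn make_foo(count: u64) -> Foo {
    Foo { flag: 1, count }
}

/// Calls `make_foo` through a signature that returns `MaybeUninit<Foo>`.
fn call_as_maybe_uninit(count: u64) -> u64 {
    let declared: extern "C" fn(u64) -> Foo = make_foo;
    // Assumption: depends on `Foo` and `MaybeUninit<Foo>` being ABI-compatible.
    let wanted: extern "C" fn(u64) -> MaybeUninit<Foo> = unsafe { transmute(declared) };
    let out = wanted(count);
    // This stand-in initializes every field, so reading the value is fine.
    unsafe { out.assume_init().count }
}

fn main() {
    println!("{}", call_as_maybe_uninit(9));
}
```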

In my opinion, I think that relies on the ABI compatibility guarantee that was made. My other alternative is a type that can represent this ABI (MaybeUninitIgnoringPadding) that is not tied to the MaybeUninit name, but that suggestion has been repeatedly ignored so whatever. This to me epitomizes the bad faith argument undertaken here, where the use cases provided are being ignored and minimized as much as reasonably possible, while potential compromises are ignored and not even addressed.

Ultimately I no longer really have an opinion on this change - it will likely be rammed through anyways. I would like to say if we don't make it easy (or even possible) for people to do the use cases that they want in the correct way, I suspect that they will just not. Personally, it's looking like the best "migration" if this were to occur is simply willfully invoking UB, and I'll likely be doing that for my code in order to immunize myself against this change.

@comex

comex commented Mar 23, 2025

Fundamentally, on one side, I have a function fn(Args) -> Foo, (where uninitness is not a factor because we are on the ABI level)

One of the points Ralf is trying to make is that that's not a thing.

If you're interoperating with raw assembly, then there is no uninit, but there is also no such thing as fn(Args) -> Foo, only registers and stack. You have to think about ABI lowering manually on a function-by-function basis. So if you are starting from scratch, there is no reason why generic ABI guarantees of the form "Foo is the same as MaybeUninit<Foo>" would be useful at all. Now, in practice, nobody is starting from scratch, so maybe you have a pile of existing assembly functions with existing C or Rust function signatures, and now you want to change the signatures to be more uninit-aware. In this case, generic guarantees would help but they're not really necessary; you would just need to verify that the ABI still matches for each of the specific functions you're looking at. Most of the time it will.

Is that your use case?

If so, I can understand how going through assembly functions would be a pain, and the backwards-compat break is inherently a pain, but I don't understand how it becomes as huge a problem as you're suggesting.

On the other hand, if you're interoperating with C, then there is such a thing as function signatures, but there is also such a thing as uninit and UB. Which gets into this awkward situation where there is a ton of C code that either (a) is UB but nobody cares, or (b) is not UB according to the spec but the compiler optimizes it as if it were UB. And Rust is trying to be stricter on that front. So perhaps you want to add MaybeUninit on the Rust side but not the C side (because C doesn't have this concept), even though from the compiler's perspective that's pretty weird because the optimizations are the same on both ends.

Is that your use case?

If so, then I actually do understand how this could potentially be a huge problem. The C side may be UB, but it works (presumably), and the function signatures may be baked into legacy code, so it makes sense to want to only use MaybeUninit on the Rust side. And if the code is portable, then analyzing the ABI on a case-by-case basis is not possible, so the only solution is to push MaybeUninit down to struct fields. Which is possible but very nasty.

Though the scope of the nastiness is still unclear, since most C libraries don't do a lot of passing structs by value.

Ultimately, I'm wildly speculating here, because you haven't explained your use case. You need to stop with the charged language and explain, or else we will all continue to not understand each other.

@carbotaniuman

I apologize, given the complexity of the project I have tried to give the guarantee I actually need, but I'll provide as much detail as possible here. The project I used for this has several layers, some of which are in C, some of which are in assembly, some of which are in a custom glorified macro assembler, and some of them are in a custom DSL written in C. The usage of these languages is pretty normal, with a portable C implementation alongside some custom assembly (and DSL) implementations to better take advantage of the hardware. The actual purpose of the library is DSP-y things, so performance is relatively important.

The actual interface that a user would use is of course a relatively normal C interface. Given the maintainability of the tech stack however, it would be nice to move some of this to Rust. Unfortunately this exposes us to the bad internal interfaces. For instance, some callsites in C pass along something like:

#include <stddef.h> // size_t

// I believe this would be directly broken by this change
struct Renamed {
    char foo;
    unsigned long long int bar;
};
struct Trimmed {
    long int foo, bar, baz;
    int num[16384];
};
// Renamed is passed uninit into this function
struct Renamed process(struct Renamed a);
// Trimmed is passed uninit into this function
struct Trimmed do_stuff(struct Trimmed a, size_t in, size_t *out);

These args are passed indirectly (I think on all ABIs, but I'm decidedly not an ABI expert), and while they really should be a pointer, making that change is not really easy given the current state of that codebase.
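For comparison, a pointer-based version of such an interface might look like the sketch below. This is not the existing codebase: `Block` is a hypothetical, much smaller stand-in for the large struct, and `do_stuff_ptr` is a Rust-defined stand-in for a C function that takes the struct by pointer. Since the struct stays in memory, the by-value ABI question does not come up.

```rust
use std::mem::MaybeUninit;
use std::ptr::{addr_of, addr_of_mut};

// Hypothetical, shrunken stand-in for the large struct.
#[repr(C)]
#[allow(dead_code)]
struct Block {
    foo: i64,
    bar: i64,
    baz: i64,
    num: [i32; 4],
}

// Stand-in for a C function taking the struct by pointer.
// It writes only `foo` and `*out`, leaving everything else untouched.
extern "C" fn do_stuff_ptr(a: *mut Block, input: usize, out: *mut usize) {
    unsafe {
        addr_of_mut!((*a).foo).write(input as i64);
        out.write(input + 1);
    }
}

/// Hands uninitialized storage to the callee and reads back only what it wrote.
fn run(input: usize) -> (i64, usize) {
    let mut block = MaybeUninit::<Block>::uninit();
    let mut out = MaybeUninit::<usize>::uninit();
    do_stuff_ptr(block.as_mut_ptr(), input, out.as_mut_ptr());
    // Read `foo` through a raw pointer; the rest of `block` is still uninitialized.
    let foo = unsafe { addr_of!((*block.as_ptr()).foo).read() };
    let out = unsafe { out.assume_init() };
    (foo, out)
}

fn main() {
    println!("{:?}", run(5));
}
```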

If you're interoperating with raw assembly, then there is no uninit, but there is also no such thing as fn(Args) -> Foo, only registers and stack. You have to think about ABI lowering manually on a function-by-function basis.

Sure, but ultimately an assembly function can fulfill(?) some ABI lowering such that it is the same as a fn(Args) -> Foo. Technically speaking, it is an assembly function that for instance expects certain parameters to be passed in registers, but I think that's relatively unhelpful for wanting to actually call this function from C or Rust.

On the other hand, if you're interoperating with C, then there is such a thing as function signatures, but there is also such a thing as uninit and UB. Which gets into this awkward situation where there is a ton of C code that either (a) is UB but nobody cares, or (b) is not UB according to the spec but the compiler optimizes it as if it were UB. And Rust is trying to be stricter on that front. So perhaps you want to add MaybeUninit on the Rust side but not the C side (because C doesn't have this concept), even though from the compiler's perspective that's pretty weird because the optimizations are the same on both ends.

The C code is compiled with a legacy compiler that does not optimize usage of uninit memory, or else I doubt the code would currently be working.

Is that your use case?

Yes, this is basically the use-case, with some extra context given by the responses above.

And if the code is portable, then analyzing the ABI on a case-by-case basis is not possible, so the only solution is to push MaybeUninit down to struct fields. Which is possible but very nasty.

Ultimately I think this is the main contention I have here. Writing MaybeUninitIgnoringPadding<T> means I am 100% confident that no matter what T is, the ABI will be the same, except that uninitialized memory will be allowed in the type. Any carve-outs means that I have to go through and audit the actual type, and potentially write custom structs where MaybeUninit<T> is pushed to the scalars.

Pragmatically, does it matter that I used MaybeUninit<T> in the Rust interfaces instead of just T? Probably not, beyond being "more correct" and self-documenting, but given my involvement in this sort of discussion I am the type to try to make my code as correct as possible. And if I had chosen T, I think the compiler would be far less likely to break my code than this change would be. I think that having taken the time out to be more careful by using MaybeUninit, this sort of code should not be broken wholesale without keeping the underlying ability to write these ABIs.
