Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support slicing a KnownLayout type #1290

Open
2 of 6 tasks
joshlf opened this issue May 17, 2024 · 8 comments
Open
2 of 6 tasks

Support slicing a KnownLayout type #1290

joshlf opened this issue May 17, 2024 · 8 comments

Comments

@joshlf
Copy link
Member

joshlf commented May 17, 2024

Progress


Given a &[T], Rust supports creating a new &[T] referring to a subset of the original referent, and also supports splitting into two non-overlapping &[T]s that, together, cover the original referent. To my knowledge, this is not supported for slice DSTs.

We could add APIs for these slicing operations to KnownLayout<PointerMetadata = usize>.

One use case for this design would be to support packet formats with explicit length fields - it would be a possibly lighter-weight way of supporting #1289 without requiring solving the general problem described in that issue.


Suggestion thanks to @kupiakos: Currently the API looks like this:

pub unsafe trait SplitAt: KnownLayout<PointerMetadata = usize> {
    fn split_at(&self, l_len: usize) -> Option<(&Self, &[Self::Elem])>
       where Self: Immutable { ... }

    fn split_at_mut(
        &mut self,
        l_len: usize,
    ) -> Option<(&mut Self, &mut [Self::Elem])> { ... }
}

We want to make two changes:

  • Support cases in which there is guaranteed to be no padding (thanks to IntoBytes)
  • Support shared references when Self: !Immutable, adding padding overlap as an error condition

We can support these uniformly like so:

pub unsafe trait SplitAt: KnownLayout<PointerMetadata = usize> {
    fn split_at(&self, l_len: usize) -> Result<(&Self, &[Self::Elem]), SplitAtError<&'_ Self>> { ... }

    fn split_at_mut(
        &mut self,
        l_len: usize,
    ) -> Result<(&mut Self, &mut [Self::Elem]), SplitAtError<&'_ mut Self>> { ... }
}

pub struct OutOfBoundsError;

pub struct PaddingOverlapError;

pub enum SplitAtError<Src, PaddingOverlap = PaddingOverlapError> {
    OutOfBounds(OutOfBoundsError),
    PaddingOverlap(PaddingOverlap),
}

We can then provide methods which convert SplitAtError<_, PaddingOverlapError> to SplitAtError<_, !> with certain bounds:

  • SplitAtError<&T, PaddingOverlapError> -> SplitAtError<&T, !> where either:
    • T: Immutable
    • T: IntoBytes
  • SplitAtError<&mut T, PaddingOverlapError> -> SplitAtError<&mut T, !> where T: IntoBytes
@joshlf
Copy link
Member Author

joshlf commented Mar 9, 2025

If we also encode the trailing slice element type in KnownLayout (e.g. via an associated type), then we could support a more general split_at operation which would return (&Self, &[Self::Elem]) or (&mut Self, &mut [Self::Elem]).

@joshlf
Copy link
Member Author

joshlf commented Mar 9, 2025

If we also encode the trailing slice element type in KnownLayout (e.g. via an associated type), then we could support a more general split_at operation which would return (&Self, &[Self::Elem]) or (&mut Self, &mut [Self::Elem]).

I wonder if there's some way that we could provide an associated element type only when PointerMetadata = usize. Could we teach Rust to only expect such a type to exist when PointerMetadata = usize?

@jswrenn
Copy link
Collaborator

jswrenn commented Mar 9, 2025

I wonder if there's some way that we could provide an associated element type only when PointerMetadata = usize.

We can probably solve this problem alongside #2149 in 0.9.

In the meantime, I've been experimenting on a branch with having TrailingElem = () for sized types, and that works totally fine. If we want, our surface-level API can require KnownLayout<PointerMetadata = usize> for split_at.


The interaction of dynamic padding and splitting is weird. Not insurmountably weird, but the semantics are going to be tricky to document. In particular, .split_at(0) cannot always mean split at the beginning of the trailing slice. To illustrate, let's assume that invocation does have this meaning and mentally execute this code:

use zerocopy::*; // 0.8.16

#[derive(FromBytes, KnownLayout, Immutable)]
#[repr(C)]
struct Foo(u16, u8, [u8]);

fn main() {
    let bytes = [0, 0, 1, 2];
    let foo = Foo::ref_from_bytes(&bytes).unwrap();
    let (bar, trailing) = foo.split_at(0);
    assert_eq!(trailing, &[2]);
}

bar is either an invalid pointer (because it references too few bytes to fulfill Foo's padding requirements), or it overlaps with trailing (which might be valid in this case, but is extremely weird and is definitely invalid if anything is mutable).

Rather, for any DST, there's a minimum element index at which the split could occur, and the parameter passed to split_at is a positive offset with respect to this minimum index. (Or 0 can mean start-of-slice, and we document that the operation fails unless the argument is greater than the minimum index at which the split can occur.)

@jswrenn
Copy link
Collaborator

jswrenn commented Mar 9, 2025

Another gnarly type for our bestiary:

#[repr(C)]
struct Foo(u32, u8, [[u8; 3]]);

In this case, there needs to be at least three elements in the trailing slice.

@joshlf
Copy link
Member Author

joshlf commented Mar 9, 2025

Maybe the thing to do is just to require Self: IntoBytes so that the API can work with any index and has simpler semantics.

@jswrenn
Copy link
Collaborator

jswrenn commented Mar 9, 2025

Maaaaybe. For the sake of minimizing API bloat, I'd really like to see us approach this as a generalization of PtrInner::split_at.

@joshlf
Copy link
Member Author

joshlf commented Mar 9, 2025

Keep in mind another edge case: there may be certain indexes which are valid and certain ones which are invalid, and there may be arbitrarily many invalid indexes. For example, with alignment 2 and trailing [u8], half of all slice lengths require padding.

I struggle to think of cases in which people would want to use this API with such a type, as they'd have to either accept runtime validation or have some very odd requirements such that they know for certain that they'll only ever slice using valid slice lengths.

jswrenn added a commit that referenced this issue Mar 10, 2025
Makes progress towards #1290.
jswrenn added a commit that referenced this issue Mar 13, 2025
Makes progress towards #1290.
jswrenn added a commit that referenced this issue Mar 17, 2025
Makes progress towards #1290.
jswrenn added a commit that referenced this issue Mar 17, 2025
Makes progress towards #1290.
jswrenn added a commit that referenced this issue Mar 17, 2025
Makes progress towards #1290.
jswrenn added a commit that referenced this issue Mar 17, 2025
Makes progress towards #1290.
jswrenn added a commit that referenced this issue Mar 17, 2025
Makes progress towards #1290.
jswrenn added a commit that referenced this issue Mar 17, 2025
Makes progress towards #1290.
jswrenn added a commit that referenced this issue Mar 18, 2025
Makes progress towards #1290.
jswrenn added a commit that referenced this issue Mar 18, 2025
Makes progress towards #1290.
jswrenn added a commit that referenced this issue Mar 19, 2025
Makes progress towards #1290.
jswrenn added a commit that referenced this issue Mar 19, 2025
Makes progress towards #1290.
jswrenn added a commit that referenced this issue Mar 19, 2025
Makes progress towards #1290.
jswrenn added a commit that referenced this issue Mar 19, 2025
Makes progress towards #1290.
jswrenn added a commit that referenced this issue Mar 19, 2025
Makes progress towards #1290.
jswrenn added a commit that referenced this issue Mar 19, 2025
Makes progress towards #1290.
jswrenn added a commit that referenced this issue Mar 19, 2025
Makes progress towards #1290.
jswrenn added a commit that referenced this issue Mar 19, 2025
Makes progress towards #1290.
jswrenn added a commit that referenced this issue Mar 19, 2025
Makes progress towards #1290.
jswrenn added a commit that referenced this issue Mar 19, 2025
Makes progress towards #1290.
jswrenn added a commit that referenced this issue Mar 19, 2025
Makes progress towards #1290.
jswrenn added a commit that referenced this issue Mar 19, 2025
Makes progress towards #1290.
jswrenn added a commit that referenced this issue Mar 20, 2025
Makes progress towards #1290.
jswrenn added a commit that referenced this issue Mar 24, 2025
Makes progress towards #1290.
jswrenn added a commit that referenced this issue Mar 24, 2025
This derivable trait provides generalized splitting for slice DSTs.

Makes progress towards #1290.
jswrenn added a commit that referenced this issue Mar 24, 2025
This derivable trait provides generalized splitting for slice DSTs.
Splitting a DST `T` produces a pair `(T, [T::Elem])`, where `T::Elem`
is the element type of `T`'s trailing slice. We expose this
functionality as a derivable trait, so as to not violate the privacy
of the trailing slice.

Makes progress towards #1290.
jswrenn added a commit that referenced this issue Mar 24, 2025
This derivable trait provides generalized splitting for slice DSTs.
Splitting a DST `T` produces a pair `(T, [T::Elem])`, where `T::Elem`
is the element type of `T`'s trailing slice. We expose this
functionality as a derivable trait, so as to not violate the privacy
of the trailing slice.

Makes progress towards #1290.
jswrenn added a commit that referenced this issue Mar 24, 2025
This derivable trait provides generalized splitting for slice DSTs.
Splitting a DST `T` produces a pair `(T, [T::Elem])`, where `T::Elem`
is the element type of `T`'s trailing slice. We expose this
functionality as a derivable trait, so as to not violate the privacy
of the trailing slice.

Makes progress towards #1290.
jswrenn added a commit that referenced this issue Mar 24, 2025
This derivable trait provides generalized splitting for slice DSTs.
Splitting a DST `T` produces a pair `(T, [T::Elem])`, where `T::Elem`
is the element type of `T`'s trailing slice. We expose this
functionality as a derivable trait, so as to not violate the privacy
of the trailing slice.

Makes progress towards #1290.
jswrenn added a commit that referenced this issue Mar 24, 2025
This derivable trait provides generalized splitting for slice DSTs.
Splitting a DST `T` produces a pair `(T, [T::Elem])`, where `T::Elem`
is the element type of `T`'s trailing slice. We expose this
functionality as a derivable trait, so as to not violate the privacy
of the trailing slice.

Makes progress towards #1290.
github-merge-queue bot pushed a commit that referenced this issue Mar 24, 2025
This derivable trait provides generalized splitting for slice DSTs.
Splitting a DST `T` produces a pair `(T, [T::Elem])`, where `T::Elem`
is the element type of `T`'s trailing slice. We expose this
functionality as a derivable trait, so as to not violate the privacy
of the trailing slice.

Makes progress towards #1290.
@kupiakos
Copy link
Contributor

Alright! Sorry it took me a few days to take a look. 😅

  • The design to defer type bounds to error conversions will help self-validating DST types be more clear and flexible to manipulate. Thank you for working on this! 😁

  • Could the l_len parameter be renamed to mid to match <[T]>::split_at?

  • methods which convert SplitAtError<_, PaddingOverlapError> to SplitAtError<_, !>

    Hmm, from my understanding of the 0.8 error conversions, it would be more consistent to convert directly to OutOfBoundsError because this is going from two error cases to one, like CastError to SizeError. I can still see some use in the generic parameter here.

    • This doesn't preclude a From<SplitAtError<T, P>> for OutOfBoundsError<T> if it carries the type parameter - it'd be consistent.
  • SplitAtError<&T, PaddingOverlapError> -> SplitAtError<&T, !>

    • I request that there be impl<T: IntoBytes + Immutable, P> From<SplitAtError<&T, P>> for OutOfBoundsError so the most common use-case can use From.
    • The conversions to OutOfBoundsError that only require one of the bounds can then be named methods.
  • The error module docs could use some adjustment with these changes, right?

  • Is there any concern about the <[T]>::split_at name overlapping? It wouldn't break any usage to write use zerocopy::SplitAt, but maybe it's worth a mention in the doc examples how you use this with regular slices?

    • On that note, perhaps there could be a TODO/ issue for more trait-level docs/examples.
  • Should the PaddingOverlapError/OutOfBoundsError also contain the Src so it can impl Into<!> if the SplitAtError is matched on instead of converted first? It does add complexity 🤷‍♀

jswrenn added a commit that referenced this issue Mar 28, 2025
By doing so, dynamic overlap checking can be avoided entirely in
many cases.

Makes progress towards #1290.
jswrenn added a commit that referenced this issue Mar 28, 2025
By doing so, dynamic overlap checking can be avoided entirely in
many cases.

Makes progress towards #1290.
jswrenn added a commit that referenced this issue Mar 28, 2025
By doing so, dynamic overlap checking can be avoided entirely in
many cases.

Makes progress towards #1290.
jswrenn added a commit that referenced this issue Mar 29, 2025
By doing so, dynamic overlap checking can be avoided entirely in
many cases.

Makes progress towards #1290.
jswrenn added a commit that referenced this issue Mar 29, 2025
By doing so, dynamic overlap checking can be avoided entirely in
many cases.

Makes progress towards #1290.
jswrenn added a commit that referenced this issue Mar 29, 2025
By doing so, dynamic overlap checking can be avoided entirely in
many cases.

Makes progress towards #1290.
jswrenn added a commit that referenced this issue Mar 29, 2025
By doing so, dynamic overlap checking can be avoided entirely in
many cases.

Makes progress towards #1290.
jswrenn added a commit that referenced this issue Mar 29, 2025
By doing so, dynamic overlap checking can be avoided entirely in
many cases.

Makes progress towards #1290.
jswrenn added a commit that referenced this issue Mar 29, 2025
By doing so, dynamic overlap checking can be avoided entirely in
many cases.

Makes progress towards #1290.
jswrenn added a commit that referenced this issue Mar 29, 2025
By doing so, dynamic overlap checking can be avoided entirely in
many cases.

Makes progress towards #1290.
jswrenn added a commit that referenced this issue Mar 29, 2025
By doing so, dynamic overlap checking can be avoided entirely in
many cases.

Makes progress towards #1290.
jswrenn added a commit that referenced this issue Mar 29, 2025
By doing so, dynamic overlap checking can be avoided entirely in
many cases.

Makes progress towards #1290.
jswrenn added a commit that referenced this issue Mar 29, 2025
By doing so, dynamic overlap checking can be avoided entirely in
many cases.

Makes progress towards #1290.
jswrenn added a commit that referenced this issue Mar 30, 2025
By doing so, dynamic overlap checking can be avoided entirely in
many cases.

Makes progress towards #1290.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants