Skip to content

quinn-udp: panic in CMsgHdr decode with fast-apple-datapath on MacOS 10.15 #2214

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
mxinden opened this issue Apr 28, 2025 · 1 comment
Open

Comments

@mxinden
Copy link
Collaborator

mxinden commented Apr 28, 2025

Documenting early findings thus far. Still debugging.

On Firefox's CI we see MacOS 10.15 failing the following debug_assert_eq.

pub(crate) unsafe fn decode<T: Copy, C: CMsgHdr>(cmsg: &impl CMsgHdr) -> T {
assert!(mem::align_of::<T>() <= mem::align_of::<C>());
debug_assert_eq!(cmsg.len(), C::cmsg_len(mem::size_of::<T>()));
ptr::read(cmsg.cmsg_data() as *const T)
}

[task 2025-04-23T18:01:03.715Z] 18:01:03     INFO -  PID 11647 | [11649] Hit MOZ_CRASH(assertion `left == right` failed
[task 2025-04-23T18:01:03.715Z] 18:01:03     INFO -  PID 11647 |   left: 1626934053
[task 2025-04-23T18:01:03.715Z] 18:01:03     INFO -  PID 11647 |  right: 16) at /builds/worker/checkouts/gecko/third_party/rust/quinn-udp/src/cmsg/mod.rs:86

https://treeherder.mozilla.org/logviewer?job_id=505190482&repo=autoland&lineNumber=7087

//CC @larseggert

@mxinden
Copy link
Collaborator Author

mxinden commented Apr 29, 2025

I believe I found the root of this issue.

We currently don't initialize (zeroize) the control message array before passing it to recvmsg_x:

let mut ctrls = [cmsg::Aligned(MaybeUninit::<[u8; CMSG_LEN]>::uninit()); BATCH_SIZE];

While recvmsg_x overwrites the first couple of entries with valid control messages, it does not zero the ones it doesn't need.

We then use CMSG_NXTHDR to iterate through the control message array:

fn cmsg_nxt_hdr(&self, cmsg: &Self::ControlMessage) -> *mut Self::ControlMessage {
let selfp = self as *const _ as *mut libc::msghdr;
unsafe { libc::CMSG_NXTHDR(selfp, cmsg) }
}

CMSG_NXTHDR should discard the unset control messages based on msg_controllen.

https://docs.rs/libc/latest/src/libc/unix/linux_like/linux/mod.rs.html#6021-6025

But in our case on MacOS 10.15 it does not seem to do it. I assume that MacOS 10.15 simply doesn't set the msg_controllen field.

Thus on MacOS 10.15 we are seeing garbage values like the following:

control_len() = 88
cmsg.cmsg_len = 2638540346
cmsg.cmsg_level = -1270110489
cmsg.cmsg_type = 171422048

Which can, depending on the concrete random values, then trigger the debug_assert_eq! above.

When we zero the control message array instead:

-    let mut ctrls = [cmsg::Aligned(MaybeUninit::<[u8; CMSG_LEN]>::uninit()); BATCH_SIZE];
+    let mut ctrls = [cmsg::Aligned([0u8; CMSG_LEN]); BATCH_SIZE];

cmsg_len on an unset control message will be 0. CMSG_NXTHDR should filter it out:

https://docs.rs/libc/latest/src/libc/unix/linux_like/linux/mod.rs.html#6018-6020

In addition, we have our own safe-guard that will also filter it out:

#[cfg(apple_fast)]
{
// On MacOS < 14 CMSG_NXTHDR might continuously return a zeroed cmsg. In
// such case, return `None` instead, thus indicating the end of
// the cmsghdr chain.
if current.len() < mem::size_of::<M::ControlMessage>() {
return None;
}
}

Note that the Apple recvmsg_x test uses calloc to allocate the control message array, which also zeroes the memory region.

https://github.com/apple-oss-distributions/xnu/blob/8d741a5de7ff4191bf97d57b9f54c2f6d4a15585/tests/recvmsg_x_test.c#L100C1-L100C90

I will create a pull request in a bit. We might want to do the same on the other BSDs.

mxinden added a commit to mxinden/quinn that referenced this issue Apr 29, 2025
`quinn-udp` with fast-apple-datapath previously did not initialize the
control message array memory before passing it to `recvmsg_x`.

On MacOS 10.15 `recvmsg_x` does not seem to set `msg_controllen`. Thus
`CMSG_NXTHDR` reads beyond the control messages written by `recvmsg_x`,
into the unitinialized memory region.

With this commit, the control message array is initialized (with zeroes)
before passing it to `recvmsg_x`, thus no longer reading unset control
messages.

See quinn-rs#2214 for details.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant