Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement __isPlatformVersionAtLeast and __isOSVersionAtLeast #794

Closed
wants to merge 1 commit into from

Conversation

madsmtm
Copy link

@madsmtm madsmtm commented Mar 18, 2025

Motivation

When Objective-C code uses @available(...), Clang inserts a call to __isPlatformVersionAtLeast (__isOSVersionAtLeast in older Clang versions). These symbols not being available in compiler-builtins sometimes ends up causing linker errors.

The workaround is to link libclang_rt.osx.a, see e.g. alexcrichton/curl-rust#279. But that's very difficult for users to figure out (and the backreferences to that issue indicates that people are still running into this in their own projects every so often).

For a recent example, this is preventing rustc from using LLVM assertions on macOS, see rust-lang/rust#62592 (comment) and rust-lang/rust#134275 (comment).

Apart from linker errors above, it is also a blocker for setting the correct minimum OS version in cc-rs, which is a huge correctness footgun: By default, if using e.g. __builtin_available(macos 10.15, *), the symbol usually happens to be left out, since clang defaults to compiling for the host macOS version, and thus things seem to work. But there's a reason why people write availability checks, which is that otherwise the code won't work on the older OS! But if we want to fix this in cc-rs, we might end up introducing linker errors in places where they weren't before.

Example crate where this behaviour is illustrated
# Cargo.toml
[package]
name = "foo"
version = "0.0.0"
edition = "2024"

[build-dependencies]
cc = "1.0"
// foo.c
int foo(void) {
    // Use some API that's a lot newer than the host machine
    if (__builtin_available(macos 20.0, *)) {
        return 1;
    } else {
        return 0;
    }
}
// build.rs
fn main() {
    println!("cargo:rerun-if-changed=foo.c");
    cc::Build::new().file("foo.c").compile("foo");
}
// main.rs
unsafe extern "C" {
    safe fn foo() -> core::ffi::c_int;
}

fn main() {
    println!("{}", foo());
}

(Finally, I'll reveal that my super secret evil agenda is to expose some variant of @available/__builtin_available in Rust's std, and that'll probably be easier if the bulk of the implementation is already here in compiler-builtins. But I believe adding this here has value regardless of those future plans).

Solution

Implement __isPlatformVersionAtLeast and __isOSVersionAtLeast, see the code comments for more details. Of particular note:

I've choosen to stray a bit from LLVM's upstream implementation, and not use _availability_version_check since it has problems when compiling with an older SDK. Instead, we use sysctlbyname when available to still avoid the costly PList lookup in most cases, but still with a fall back to the PList lookup when that is not available (this fallback is similar to LLVM's implementation).

I've also preferred to panic when hitting an unexpected state, instead of silently returning a wrong value, but am unsure what your policies are around this?

Testing

Tested using the equivalent of cargo test --manifest-path testcrate/Cargo.toml --test os_version_check on the following configurations:

  • macOS 14.7.3 on a Macbook Pro M2
    • aarch64-apple-darwin
    • x86_64-apple-darwin (under Rosetta)
    • aarch64-apple-ios-macabi
    • x86_64-apple-ios-macabi (under Rosetta)
    • aarch64-apple-ios (using Xcode's "Designed for iPad" setting)
    • aarch64-apple-ios-sim (in iOS Simulator, as iPhone with iOS 17.5)
    • aarch64-apple-ios-sim (in iOS Simulator, as iPad with iOS 18.2)
    • aarch64-apple-tvos-sim (in tvOS Simulator)
    • aarch64-apple-watchos-sim (in watchOS Simulator)
    • aarch64-apple-ios-sim (in visionOS simulator, using Xcode's "Designed for iPad" setting)
    • aarch64-apple-visionos-sim (in visionOS Simulator)
  • macOS 15.3.1 VM
    • aarch64-apple-darwin
    • aarch64-apple-ios-macabi
  • macOS 10.12.6 on an Intel Macbook from 2013
    • x86_64-apple-darwin
    • i686-apple-darwin
    • x86_64-apple-ios (in iOS Simulator)
  • iOS 9.3.6 on a 1st generation iPad Mini
    • armv7-apple-ios with an older compiler

Along with manually inspecting the output of version_from_sysctl() and version_from_plist(), and verifying that they actually match what's expected.

I believe the only real omissions here would be:

  • aarch64-apple-ios on a newer iPhone that has sysctl available (iOS 11.4 or above).
  • aarch64-apple-ios on a Vision Pro using Xcode's "Designed for iPad" setting.

But I don't have the hardware available to test those.

@madsmtm madsmtm force-pushed the darwin-os_version_check branch from 1e9e3cb to 3425b27 Compare March 18, 2025 03:21
@madsmtm
Copy link
Author

madsmtm commented Mar 18, 2025

I don't have perms to do:
@rustbot ping apple

So gonna ping people from that notification group manually:
CC @hkratz, @inflation, @shepmaster, @thomcc, @nvzqz, @madsmtm, @BlackHoleFox

@rustbot
Copy link

rustbot commented Mar 18, 2025

Error: This repository is not enabled to use triagebot.
Add a triagebot.toml in the root of the default branch to enable it.

Please file an issue on GitHub at triagebot if there's a problem with this bot, or reach out on #t-infra on Zulip.

@madsmtm madsmtm force-pushed the darwin-os_version_check branch 2 times, most recently from 1aed11c to 8e9c8c4 Compare March 18, 2025 03:25
@inflation
Copy link

Slightly off-topic, I’ve also encountered issues when library need symbols from clang_rt on macOS, e.g. x86 feature detection from the underlying C library. Would there be a better way to let the downstream knows we need clang_rt linked?

Comment on lines +56 to +60
// Test internals

#[path = "../../src/os_version_check/darwin_impl.rs"]
#[allow(dead_code)]
mod darwin_impl;
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bit unsure about this, should I make it a unit-test inside src/os_version_check/darwin_impl.rs instead of an integration test?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unit tests unfortunately don't really work in compiler-builtins. It is fine to just make everything public so you don't need to do the module path trick.

Comment on lines +556 to +590
#[track_caller]
const fn parse_usize(mut bytes: &[u8]) -> (usize, &[u8]) {
// Ensure we have at least one digit (that is not just a period).
let mut ret: usize = if let Some((&ascii, rest)) = bytes.split_first() {
bytes = rest;

match ascii {
b'0'..=b'9' => (ascii - b'0') as usize,
_ => panic!("found invalid digit when parsing version"),
}
} else {
panic!("found empty version number part")
};

// Parse the remaining digits.
while let Some((&ascii, rest)) = bytes.split_first() {
let digit = match ascii {
b'0'..=b'9' => ascii - b'0',
_ => break,
};

bytes = rest;

// This handles leading zeroes as well.
match ret.checked_mul(10) {
Some(val) => match val.checked_add(digit as _) {
Some(val) => ret = val,
None => panic!("version is too large"),
},
None => panic!("version is too large"),
};
}

(ret, bytes)
}
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfamiliar with compiler-builtins, so I can't tell if there's a better way of doing this kind of parsing?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Things are helped by this being specific to Apple platforms, so we already depend on libSystem.dylib, and could use for example sscanf. What do you prefer?

Copy link
Author

@madsmtm madsmtm Mar 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I think I'd prefer that myself, it's also what LLVM does (the manual parsing here was a remnant of my implementation in objc2).

Will wait 'till I hear back, and then I'll probably use sscanf instead.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll take a closer look at the parsing in a bit, but it is probably better not to depend on sscanf for the reason mentioned in #794 (comment).

@madsmtm
Copy link
Author

madsmtm commented Mar 18, 2025

Slightly off-topic, I’ve also encountered issues when library need symbols from clang_rt on macOS, e.g. x86 feature detection from the underlying C library. Would there be a better way to let the downstream knows we need clang_rt linked?

Unsure, I haven't really looked into where Clang / Xcode has its runtime libraries for different archs. In any case, what you're discussing is probably more relevant to #296.


#[cold]
// Use `extern "C"` to abort on panic, allowing `current_version` to be free of panic handling.
pub(super) extern "C" fn lookup_version() -> NonZero<OSVersion> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this extern C?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a micro-optimization, can remove it if you want.

Details: Optimally, we inline __isPlatformVersionAtLeast and current_version (requires LTO), and thus end up with a direct call to lookup_version. By marking it extern "C", we force any panics from version_from_sysctl/version_from_plist to be turned into aborts (without having to deal with std::panic::catch_unwind), which improves code-size (since the now-inlined caller doesn't have to emit unwind handling).

/// Panics if reading or parsing the version fails (or if the system was out of memory).
///
/// We deliberately choose to panic, as having this silently return an invalid OS version would be
/// impossible for a user to debug.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

compiler-builtins may not panic as that requires libcore but compiler-builtins can't depend on any other crates as it is last on the liker cli. Instead you will have to abort.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as it is last on the liker cli

Pretty sure Apple's linker doesn't require ordering in the same way as traditional ELF linkers?

I'll have to test it to be sure though, I'll see if I can build a custom toolchain with this branch of compiler-builtins.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if the linker accepts it, you run the risk of cyclic dependencies with the panic handler (as __isPlatformVersionAtLeast panics, which calls the panic handler, which in turn calls a function that calls __isPlatformVersionAtLeast).

// SAFETY: Same signatures as in `libc`.
//
// NOTE: We do not need to link these; that will be done by `std` by linking `libSystem`
// (which is required on macOS/Darwin).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the user is building a no_std macOS application?

Copy link
Author

@madsmtm madsmtm Mar 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fairly sure even no_std requires libSystem.dylib (or libc.dylib or similar, which exports many of the same symbols).

For example, compiling the following with rustc no_std.rs -Cpanic=abort:

#![no_std]
#![no_main]

#[panic_handler]
fn handler(_info: &core::panic::PanicInfo<'_>) -> ! {
    loop {}
}

#[no_mangle]
fn main() {}

Results in:

Undefined symbols for architecture arm64:
  "dyld_stub_binder", referenced from:
      <initial-undefines>
ld: symbol(s) not found for architecture arm64
clang: error: linker command failed with exit code 1 (use -v to see invocation)

That is, fundamentally, building a Mach-O dylib (of which binaries are a subset) requires the dyld_stub_binder symbol (I think it'd inserted by the linker?).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait... I think I may be mistaken here.

Seems like using MACOSX_DEPLOYMENT_TARGET=12.0 doesn't require the symbol! So you're probably right that we'd need to add a #[link(name = "System")].

@tgross35
Copy link
Contributor

(Finally, I'll reveal that my super secret evil agenda is to expose some variant of @available/__builtin_available in Rust's std, and that'll probably be easier if the bulk of the implementation is already here in compiler-builtins. But I believe adding this here has value regardless of those future plans).

Could it make sense to provide this as a weak symbol in std for now, and then consider exposing that directly later? We may be able to hack around the potential linking issues in compiler-builtins but if doing things correctly requires env / filesystem / allocation, then it would be much easier to only provide these implementations when those things are available. Afaik there isn't any precedent for doing this, but there also aren't any cases of using libc or extern symbols in compiler-builtins (__rustc_probestack being the only exception).

For similar reasons I am assuming LLVM would also prefer these to be libc or libgcc functions rather than compiler-rt, but doesn't have control over those things on Apple platforms to add them.

This might be worth bringing up on Zulip, and/or just submitting an ACP to add a version of __builtin_available.

@madsmtm
Copy link
Author

madsmtm commented Mar 18, 2025

Could it make sense to provide this as a weak symbol in std for now, and then consider exposing that directly later?

I'm not familiar enough with the linking constraints that compiler-builtins are under to be sure? Would making it a weak symbol enough to not conflict with the symbol in libclang_rt.dylib?

To be clear, I'd be fine with requiring std to not get the linker errors I discussed above, I only proposed it here to mirror compiler-rt. Is the only reason to have something in compiler-builtins to allow #![no_core] to use it too?

if doing things correctly requires env / filesystem / allocation

It seems crazy to me too, but yes, it does require all of these.

For similar reasons I am assuming LLVM would also prefer these to be libc or libgcc functions rather than compiler-rt, but doesn't have control over those things on Apple platforms to add them.

Ah, yeah, thinking about it, I suspect they added it to compiler-rt such that it could work even on older OSes (since libc/libSystem are shipped with the OS). But we don't have that same constraint with std.

Actually, does compiler built-ins in Clang that call libc functions exist? Nvmd, allocation. Yeah, in that case it's stupid to have this in compiler-rt.

This might be worth bringing up on Zulip, and/or just submitting an ACP to add a version of __builtin_available.

I've opened a Zulip thread, will submit an ACP for a version of __builtin_available later.

@tgross35
Copy link
Contributor

tgross35 commented Mar 18, 2025

I'm not familiar enough with the linking constraints that compiler-builtins are under to be sure? Would making it a weak symbol enough to not conflict with the symbol in libclang_rt.dylib?

This has to be tested but I think it should. Amanieu mentioned before that static+weak symbols may get picked over dynamic+strong, so ours might get favored. But that doesn't seem like a problem.

To be clear, I'd be fine with requiring std to not get the linker errors I discussed above, I only proposed it here to mirror compiler-rt. Is the only reason to have something in compiler-builtins to allow #![no_core] to use it too?

compiler-builtins doesn't actually get added by default for #![no_core] https://github.com/rust-lang/rust/blob/75530e9f72a1990ed2305e16fd51d02f47048f12/compiler/rustc_metadata/src/creader.rs#L1116-L1122. I think this crate exists mostly for organization, the compiler expects a handful of extern "C" functions to be available and they probably could be in core but it's convenient to keep them separate.

@madsmtm
Copy link
Author

madsmtm commented Mar 25, 2025

I have opened rust-lang/rust#138944 instead, will pursue that and close this for now, thanks for the input so far!

I'm not familiar enough with the linking constraints that compiler-builtins are under to be sure? Would making it a weak symbol enough to not conflict with the symbol in libclang_rt.dylib?

This has to be tested but I think it should. Amanieu mentioned before that static+weak symbols may get picked over dynamic+strong, so ours might get favored. But that doesn't seem like a problem.

Turns out I was a little bit mistaken in my comment here, libclang_rt is actually libclang_rt.*.a, i.e. a static library.

@madsmtm madsmtm closed this Mar 25, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants