Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

unwrap_or_else(|| debug_unreachable_unchecked()) still generates jump instructions #5064

Closed
james7132 opened this issue Jun 21, 2022 · 4 comments
Labels
A-ECS Entities, components, systems, and events C-Code-Quality A section of code that is hard to understand or change C-Performance A change motivated by improving speed, memory usage or compile times

Comments

@james7132
Copy link
Member

What problem does this solve or what need does it fill?

When writing the core Fetch related code, there's a large amount of unwrap_or_else(|| debug_unreachable_unchecked() to suggest to the compiler that, in release builds, the Option is never None and unwrapping is always safe. This is supposed to hint to the compiler that this branch is dead.

However, when attempting to compare this pattern against an actual get_unchecked implementation for SparseSets seems to still generates two jump instructions (at least on x86 platforms). You can see this here: https://godbolt.org/z/e3EPE1vxW, the full assembly output has been copied below.

example::get_checked:
        mov     rax, qword ptr [rdi + 24]
        shl     rsi, 4
        mov     rax, qword ptr [rax + rsi + 8]
        shl     rax, 3
        add     rax, qword ptr [rdi]
        ret

example::get_unwrap_checked_unreachable:
        mov     rcx, rsi
        shl     rcx, 4
        add     rcx, qword ptr [rdi + 24]
        xor     eax, eax
        cmp     qword ptr [rdi + 40], rsi
        cmovbe  rcx, rax
        jbe     .LBB1_4
        cmp     qword ptr [rcx], 0
        je      .LBB1_2
        mov     rax, qword ptr [rcx + 8]
        shl     rax, 3
        add     rax, qword ptr [rdi]
.LBB1_4:
        ret
.LBB1_2:
        xor     eax, eax
        ret

What solution would you like?

Internal, pub(crate) get_unchecked(_mut) variants on SparseSet and derivative metadata stores and storages that is only used for release builds.

What alternative(s) have you considered?

Leaving this untouched. Eat the performance penalty.

@james7132 james7132 added C-Feature A new feature, making something new possible A-ECS Entities, components, systems, and events C-Performance A change motivated by improving speed, memory usage or compile times labels Jun 21, 2022
@james7132
Copy link
Member Author

A bit more digging, and it seems like the local Option<ThinSlicePtr<'_, T>> unwraps are just fine, they compile down to the same assembly, but anything that was fetched from a metadata store (i.e. Tables/SparseSets/etc) seems to be blocked off by the function boundary. It seems like it's inlining the function and then not doing further optimizations. This likely should go away with LTO or some other secondary optimization pass, but we probably shouldn't rely on that.

@alice-i-cecile alice-i-cecile added C-Code-Quality A section of code that is hard to understand or change and removed C-Feature A new feature, making something new possible labels Jun 21, 2022
@james7132
Copy link
Member Author

Upstreamed an issue for this rust-lang/rust#98468

@james7132
Copy link
Member Author

This seems to be fixed in the latest nightly: https://godbolt.org/z/9PzKWfY1K. Probably can close this when Rust 1.64 lands.

@james7132
Copy link
Member Author

This seems to have been fixed as of 1.64's release.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-ECS Entities, components, systems, and events C-Code-Quality A section of code that is hard to understand or change C-Performance A change motivated by improving speed, memory usage or compile times
Projects
None yet
Development

No branches or pull requests

2 participants