Avoid picture primitive copies via VecHelper #3362
webrender/src/util.rs
Outdated
// Matches the definition of SK_ScalarNearlyZero in Skia.
const NEARLY_ZERO: f32 = 1.0 / 4096.0;

/// A typesafe helper that separates new value construction from
/// vector growing, which allows LLVM to elide the value copy.
"and ideally construct the element in place"

impl<'a, T> Allocation<'a, T> {
    #[inline(always)]
    pub fn init(self, value: T) -> usize {
Maybe add:
// writing is safe because alloc() ensured enough capacity and Allocation holds a mutable borrow to prevent anyone else from breaking this invariant.
Are you sure this actually avoids the memcpy / memmove? A while back, at least for `self` arguments, we couldn't work around it like this; see rust-lang/rust#42763.
Confirmed in playground. In `fn foo()`, changing the `push(xxx)` to `alloc().init(xxx)` removes the `memcpy`.
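For readers following along, the `alloc().init(xxx)` shape discussed here can be sketched roughly as below. This is a minimal sketch, not the code from the PR: the field names, the `VecHelper` trait signature and the `push_zeroed` caller are assumptions made for illustration; only `Allocation::init` and the safety comment are taken from the review above. Whether the copy is actually elided depends on the compiler version and the element type, so the generated assembly still has to be checked.

```rust
use std::ptr;

/// Handle to one reserved, still uninitialized slot at the end of a `Vec`.
/// The field names are assumptions made for this sketch.
pub struct Allocation<'a, T> {
    vec: &'a mut Vec<T>,
    index: usize,
}

impl<'a, T> Allocation<'a, T> {
    /// Writes `value` into the reserved slot and returns its index.
    #[inline(always)]
    pub fn init(self, value: T) -> usize {
        // Writing is safe because `alloc()` ensured enough capacity and
        // `Allocation` holds a mutable borrow, which prevents anyone else
        // from breaking this invariant.
        unsafe {
            ptr::write(self.vec.as_mut_ptr().add(self.index), value);
            self.vec.set_len(self.index + 1);
        }
        self.index
    }
}

/// Assumed shape of the helper trait; only the name appears in the PR title.
pub trait VecHelper<T> {
    /// Grows the vector if needed and hands out the slot at `len()`.
    fn alloc(&mut self) -> Allocation<'_, T>;
}

impl<T> VecHelper<T> for Vec<T> {
    fn alloc(&mut self) -> Allocation<'_, T> {
        let index = self.len();
        if self.capacity() == index {
            self.reserve(1);
        }
        Allocation { vec: self, index }
    }
}

/// Hypothetical caller: the value is constructed only after the vector has
/// grown, which gives the optimizer a chance to build it in place.
pub fn push_zeroed(list: &mut Vec<[u64; 32]>) -> usize {
    list.alloc().init([0; 32])
}
```

Dropping an `Allocation` without calling `init` leaves the length unchanged, so the reserved slot is simply never exposed.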
I filed rust-lang/rust#56333 for a problem contributing to the first example not working well in the playground.
It looks like SmallVec also causes this to not work. I filed rust-lang/rust#56356 about that.
@@ -338,7 +338,7 @@ impl FrameBuilder {
         let mut profile_counters = FrameProfileCounters::new();
         profile_counters
             .total_primitives
-            .set(self.prim_store.prim_count);
+            .set(self.prim_store.prim_count());
It'd be nice if this only ran if profiling was on.
83b830f to 18455ee
I'm not able to use @jrmuizel's tool that finds memcopies, since IR validation fails. Is my playground experiment incorrect? Otherwise, what assumptions are wrong?
☔ The latest upstream changes (presumably #3359) made this pull request unmergeable. Please resolve the merge conflicts.
If you use an enum containing an array instead of a plain array, and initialize it with a small enum variant, the copy is elided in your example.
@jrmuizel indeed, I confirmed it with the playground experiment.
18455ee to d680e32
@bors-servo r=jrmuizel
📌 Commit d680e32 has been approved by
Avoid picture primitive copies via VecHelper
This is a successor of #3360 that avoids the borrow checker dance via RAII
Addresses part of #3358
☀️ Test successful - status-appveyor, status-taskcluster
It looks like this may not have worked:
I wonder if LLVM gets sad because the enum variant is too big.
It looks like SmallVec is contributing to the sadness:

#![crate_type = "lib"]
extern crate smallvec;
use smallvec::SmallVec;
#[derive(Default)]
pub struct L {
    a: SmallVec<[f64; 16]>,
    b: SmallVec<[f64; 16]>,
    c: SmallVec<[f64; 16]>
}
pub struct Allocation<T> {
    f: *mut T,
}
use std::ptr;
impl<T> Allocation<T> {
    pub fn init(self, value: T) {
        unsafe { ptr::write(self.f, value) };
    }
}
#[inline(never)]
pub fn bar(a: Allocation<L>) {
    a.init(L{a: SmallVec::new(), b: SmallVec::new(), c: SmallVec::new()});
}

compiles to

playground::bar:
    pushq %rbx
    subq $720, %rsp
    movq %rdi, %rbx
    xorps %xmm0, %xmm0
    movaps %xmm0, 288(%rsp)
    movaps %xmm0, (%rsp)
    movaps %xmm0, 144(%rsp)
    leaq 432(%rsp), %rdi
    movq %rsp, %rsi
    movl $144, %edx
    callq memcpy@PLT
    leaq 576(%rsp), %rdi
    leaq 144(%rsp), %rsi
    movl $144, %edx
    callq memcpy@PLT
    leaq 288(%rsp), %rsi
    movl $432, %edx
    movq %rbx, %rdi
    callq memcpy@PLT
    addq $720, %rsp
    popq %rbx
    retq
…4c9170a70ef7 (WR PR #3362). r=kats servo/webrender#3362 Differential Revision: https://phabricator.services.mozilla.com/D13355 --HG-- extra : moz-landing-system : lando
Reduce data copies during internation
Excessive copying in `fn intern()` is something we noticed yesterday with @jrmuizel. This PR attempts to improve them in two ways (each in a separate commit):
1. reduce the `Update` enum size by moving the data out
2. adopt entry-like API to accommodate the common pattern of `if index == v.len() { v.push(xxx) } else { v[index] = xxx; }` without panics between element construction and actual assignment.
It builds upon the `VecHelper` work of #3362.
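The entry-like API mentioned in item 2 could look roughly like the sketch below. It is an illustration under stated assumptions, not the code from #3366: the names `VecEntry`, `set` and `entry`, the field names of `Allocation`, and the choice to panic on an index past the end are all hypothetical. The point it tries to show is that the bounds check and any growth happen inside `entry()`, before the value is constructed, so nothing can panic between construction and assignment.

```rust
use std::ptr;

/// Reserved, uninitialized slot at the end of a `Vec` (field names assumed).
pub struct Allocation<'a, T> {
    vec: &'a mut Vec<T>,
    index: usize,
}

impl<'a, T> Allocation<'a, T> {
    #[inline(always)]
    pub fn init(self, value: T) -> usize {
        // Safe because the slot was reserved when the `Allocation` was
        // created and the mutable borrow keeps the vector untouched since.
        unsafe {
            ptr::write(self.vec.as_mut_ptr().add(self.index), value);
            self.vec.set_len(self.index + 1);
        }
        self.index
    }
}

/// Hypothetical entry type: either a fresh slot at the end of the vector
/// or an existing element to overwrite.
pub enum VecEntry<'a, T> {
    Vacant(Allocation<'a, T>),
    Occupied(&'a mut T),
}

impl<'a, T> VecEntry<'a, T> {
    /// Stores `value`; no check that could panic remains at this point.
    #[inline(always)]
    pub fn set(self, value: T) {
        match self {
            VecEntry::Vacant(alloc) => {
                alloc.init(value);
            }
            VecEntry::Occupied(slot) => {
                *slot = value;
            }
        }
    }
}

/// Assumed trait shape for this sketch.
pub trait VecHelper<T> {
    fn entry(&mut self, index: usize) -> VecEntry<'_, T>;
}

impl<T> VecHelper<T> for Vec<T> {
    fn entry(&mut self, index: usize) -> VecEntry<'_, T> {
        if index < self.len() {
            VecEntry::Occupied(&mut self[index])
        } else {
            // Assumption: only appending at exactly `len()` is allowed.
            assert_eq!(index, self.len(), "index may be at most len()");
            if self.capacity() == index {
                self.reserve(1);
            }
            VecEntry::Vacant(Allocation { vec: self, index })
        }
    }
}
```

With this shape, the `if index == v.len() { v.push(xxx) } else { v[index] = xxx; }` pattern becomes `v.entry(index).set(xxx)`.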
The smallvec example now generates better code with Nightly:

playground::bar: # @playground::bar
# %bb.0:
    subq $440, %rsp # imm = 0x1B8
    xorps %xmm0, %xmm0
    movaps %xmm0, (%rsp)
    movaps %xmm0, 144(%rsp)
    movaps %xmm0, 288(%rsp)
    movq %rsp, %rsi
    movl $432, %edx # imm = 0x1B0
    callq *memcpy@GOTPCREL(%rip)
    addq $440, %rsp # imm = 0x1B8
    retq

However, it looks like there's still an extra memcpy.
I filed a follow-up at rust-lang/rust#58082.
…4c9170a70ef7 (WR PR #3362). r=kats servo/webrender#3362 Differential Revision: https://phabricator.services.mozilla.com/D13355 UltraBlame original commit: 54d56b9a77fb9d41eff3ce1662ba5fff538365d2
This is a successor of #3360 that avoids the borrow checker dance via RAII
Addresses part of #3358