Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VecDeque::split_off has unexpectedly poor performance characteristics when splitting off from the front. #127281

Open
emilio opened this issue Jul 3, 2024 · 2 comments
Labels
C-bug Category: This is a bug. I-slow Issue: Problems and improvements with respect to performance of generated code. T-libs Relevant to the library team, which will review and decide on the PR/issue.

Comments

@emilio
Copy link
Contributor

emilio commented Jul 3, 2024

I tried this code:

use std::collections::VecDeque;

fn huge_queue() -> VecDeque<usize> {
    const HUGE: usize = 1000000;
    let mut queue = VecDeque::with_capacity(HUGE);
    for i in 0..HUGE {
        queue.push_back(i);
    }
    queue
}

const CHUNK: usize = 123;
const FAST: bool = false;

fn main() {
    let mut queue = huge_queue();
    if FAST {
        while queue.len() > CHUNK {
            queue.rotate_left(CHUNK);
            let rest = queue.split_off(queue.len() - CHUNK);
            println!("Chunk of {}", rest.len());
        }
    } else {
        while queue.len() > CHUNK {
            let rest = queue.split_off(CHUNK);
            println!("Chunk of {}", queue.len());
            queue = rest;
        }
    }
    println!("Remaining: {}", queue.len());
}

I expected the FAST branch and the else branch to have comparable performance, instead it seems VecDeque::split_off from the front is much slower than expected (as one would expect of e.g. Vec::split_off which needs to shift all the elements).

@emilio emilio added the C-bug Category: This is a bug. label Jul 3, 2024
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Jul 3, 2024
i3roly pushed a commit to i3roly/firefox-dynasty that referenced this issue Jul 9, 2024
…sal. r=dshin

Using split_off to split the front chunk from the queue seems like it'd
be efficient, but it's not:

  rust-lang/rust#127281

Instead, pass the range of nodes to distribute, and use
VecDeque::from_iter in the callee. Then, just truncate() the queue to
the remaining local work.

Differential Revision: https://phabricator.services.mozilla.com/D215705
moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this issue Jul 9, 2024
…sal. r=dshin

Using split_off to split the front chunk from the queue seems like it'd
be efficient, but it's not:

  rust-lang/rust#127281

Instead, pass the range of nodes to distribute, and use
VecDeque::from_iter in the callee. Then, just truncate() the queue to
the remaining local work.

Differential Revision: https://phabricator.services.mozilla.com/D215705
github-actions bot pushed a commit to servo/stylo that referenced this issue Jul 9, 2024
…sal. r=dshin

Using split_off to split the front chunk from the queue seems like it'd
be efficient, but it's not:

  rust-lang/rust#127281

Instead, pass the range of nodes to distribute, and use
VecDeque::from_iter in the callee. Then, just truncate() the queue to
the remaining local work.

Differential Revision: https://phabricator.services.mozilla.com/D215705
moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this issue Jul 9, 2024
…sal. r=dshin, a=dmeehan

Using split_off to split the front chunk from the queue seems like it'd
be efficient, but it's not:

  rust-lang/rust#127281

Instead, pass the range of nodes to distribute, and use
VecDeque::from_iter in the callee. Then, just truncate() the queue to
the remaining local work.

Differential Revision: https://phabricator.services.mozilla.com/D215705
i3roly pushed a commit to i3roly/firefox-dynasty that referenced this issue Jul 9, 2024
…sal. r=dshin, a=dmeehan

Using split_off to split the front chunk from the queue seems like it'd
be efficient, but it's not:

  rust-lang/rust#127281

Instead, pass the range of nodes to distribute, and use
VecDeque::from_iter in the callee. Then, just truncate() the queue to
the remaining local work.

Differential Revision: https://phabricator.services.mozilla.com/D215705
moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this issue Jul 11, 2024
…sal. r=dshin, a=dmeehan

Using split_off to split the front chunk from the queue seems like it'd
be efficient, but it's not:

  rust-lang/rust#127281

Instead, pass the range of nodes to distribute, and use
VecDeque::from_iter in the callee. Then, just truncate() the queue to
the remaining local work.

Differential Revision: https://phabricator.services.mozilla.com/D215705
ErichDonGubler pushed a commit to erichdongubler-mozilla/firefox that referenced this issue Jul 12, 2024
…sal. r=dshin

Using split_off to split the front chunk from the queue seems like it'd
be efficient, but it's not:

  rust-lang/rust#127281

Instead, pass the range of nodes to distribute, and use
VecDeque::from_iter in the callee. Then, just truncate() the queue to
the remaining local work.

Differential Revision: https://phabricator.services.mozilla.com/D215705
i3roly pushed a commit to i3roly/firefox-dynasty that referenced this issue Jul 13, 2024
…sal. r=dshin, a=dmeehan

Using split_off to split the front chunk from the queue seems like it'd
be efficient, but it's not:

  rust-lang/rust#127281

Instead, pass the range of nodes to distribute, and use
VecDeque::from_iter in the callee. Then, just truncate() the queue to
the remaining local work.

Differential Revision: https://phabricator.services.mozilla.com/D215705
@jieyouxu jieyouxu added I-slow Issue: Problems and improvements with respect to performance of generated code. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Aug 5, 2024
@jieyouxu jieyouxu removed the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Aug 13, 2024
surapunoyousei pushed a commit to Floorp-Projects/Floorp that referenced this issue Aug 21, 2024
…sal. r=dshin, a=dmeehan

Using split_off to split the front chunk from the queue seems like it'd
be efficient, but it's not:

  rust-lang/rust#127281

Instead, pass the range of nodes to distribute, and use
VecDeque::from_iter in the callee. Then, just truncate() the queue to
the remaining local work.

Differential Revision: https://phabricator.services.mozilla.com/D215705
@cuviper
Copy link
Member

cuviper commented Sep 25, 2024

        while queue.len() > CHUNK {
            let rest = queue.split_off(CHUNK);
            println!("Chunk of {}", queue.len());
            queue = rest;
        }

If I understand, your expectation was that rest would actually get the original allocation, and queue would be replaced with a new allocation for the CHUNK items (before you assign it back). So only the smaller size would need a memory copy. Right?

Note that the docs explicitly say "Returns a newly allocated VecDeque," and "Note that the capacity of self does not change."

We recently went over a similar case with Vec::split_off(0) where #76682 had optimized it to avoid any memory copy, but issue #119913 and pull request #119917 reverted that. VecDeque should not necessarily be considered the same way, given different expectations about ring buffers, but maybe we should add VecDeque::split_off_front that explicitly keeps self as the back portion with the original allocation.

@emilio
Copy link
Contributor Author

emilio commented Sep 25, 2024

If I understand, your expectation was that rest would actually get the original allocation, and queue would be replaced with a new allocation for the CHUNK items (before you assign it back). So only the smaller size would need a memory copy. Right?

Correct.

Note that the docs explicitly say "Returns a newly allocated VecDeque," and "Note that the capacity of self does not change."

Yes, and I think that's unexpected. I agree it's documented :)

maybe we should add VecDeque::split_off_front that explicitly keeps self as the back portion with the original allocation.

That would be reasonable.

glandium pushed a commit to glandium/firefox that referenced this issue Mar 11, 2025
…sal. r=dshin, a=dmeehan

Using split_off to split the front chunk from the queue seems like it'd
be efficient, but it's not:

  rust-lang/rust#127281

Instead, pass the range of nodes to distribute, and use
VecDeque::from_iter in the callee. Then, just truncate() the queue to
the remaining local work.

Differential Revision: https://phabricator.services.mozilla.com/D215705
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: This is a bug. I-slow Issue: Problems and improvements with respect to performance of generated code. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

4 participants