Commit 0f9317d

Auto merge of #43595 - oyvindln:master, r=aturon
Add an overflow check in the `Iterator::next()` impl for `Range<_>` to help with vectorization.

This helps with vectorization in some cases, such as `(0..u16::MAX).collect::<Vec<u16>>()`, because LLVM is then able to change the loop condition to use equality instead of less-than. It should help with #43124. (See also my [last comment](#43124 (comment)) there.)

This PR makes `collect` on ranges of `u16`, `i16`, `i8`, and `u8` **significantly** faster (at least on x86-64 and i686), and pretty close to, though not quite equivalent to, a [manual unsafe implementation](https://is.gd/nkoecB). 32-bit values (and 64-bit values on x86-64) were already vectorized without this change, and they still are. This PR doesn't seem to help with 64-bit values on i686, as they still don't vectorize well compared to a manual loop.

I'm a bit unsure whether this was the best way of implementing this. I tried to do it with as few changes as possible and avoided changing the `Step` trait and the behavior of `RangeFrom` (I'll leave wider changes to the trait for discussions like #43127). I tried simply changing the comparison to `self.start != self.end`, but that made the compiler segfault when compiling stage0, so I went with this method instead for now.

As for `next_back()`, reverse ranges seem to optimise properly already.
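For reference, the kind of call this description is about can be sketched as follows. This is only an illustration: the function names `collect_range` and `collect_manual` are hypothetical, and `collect_manual` is a plain safe loop written for comparison, not the unsafe implementation linked above. Whether either version actually vectorizes depends on the compiler version, target, and optimization level.

```rust
/// Collects a half-open `u16` range through the `Range` iterator,
/// i.e. the code path that `Iterator::next()` for `Range<_>` sits on.
pub fn collect_range(end: u16) -> Vec<u16> {
    (0..end).collect::<Vec<u16>>()
}

/// Hypothetical hand-written equivalent that uses an equality test
/// (`!=`) as the loop condition instead of less-than.
pub fn collect_manual(end: u16) -> Vec<u16> {
    let mut out = Vec::with_capacity(end as usize);
    let mut i: u16 = 0;
    while i != end {
        out.push(i);
        // `i != end` and `end <= u16::MAX`, so `i + 1` cannot overflow here.
        i += 1;
    }
    out
}
```

Both functions should produce the same vector for any `end`; the difference of interest is only in the code LLVM generates for the loop.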
2 parents 78efb23 + 4bb9a8b commit 0f9317d

1 file changed: +10 -3 lines changed

src/libcore/iter/range.rs

@@ -214,9 +214,16 @@ impl<A: Step> Iterator for ops::Range<A> {
     #[inline]
     fn next(&mut self) -> Option<A> {
         if self.start < self.end {
-            let mut n = self.start.add_one();
-            mem::swap(&mut n, &mut self.start);
-            Some(n)
+            // We check for overflow here, even though it can't actually
+            // happen. Adding this check does however help llvm vectorize loops
+            // for some ranges that don't get vectorized otherwise,
+            // and this won't actually result in an extra check in an optimized build.
+            if let Some(mut n) = self.start.add_usize(1) {
+                mem::swap(&mut n, &mut self.start);
+                Some(n)
+            } else {
+                None
+            }
         } else {
             None
         }
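The pattern in this hunk can be tried outside libcore with a small stand-alone sketch. The `U16Range` type below is hypothetical and exists only for illustration. It is not the standard library's `Range`, and `checked_add(1)` stands in for the internal `Step::add_usize(1)` call, on the assumption that both return `None` on overflow.

```rust
use std::mem;

/// Hypothetical stand-in for `ops::Range<u16>`, for illustration only.
pub struct U16Range {
    pub start: u16,
    pub end: u16,
}

impl Iterator for U16Range {
    type Item = u16;

    #[inline]
    fn next(&mut self) -> Option<u16> {
        if self.start < self.end {
            // `start < end <= u16::MAX`, so `start + 1` cannot overflow.
            // The check is kept anyway, mirroring the diff above.
            if let Some(mut n) = self.start.checked_add(1) {
                // After the swap, `self.start` holds the incremented value
                // and `n` holds the old start, which is yielded.
                mem::swap(&mut n, &mut self.start);
                Some(n)
            } else {
                None
            }
        } else {
            None
        }
    }
}
```

With this sketch, `U16Range { start: 3, end: 6 }` yields 3, 4, 5 and then `None`, the same sequence the unchecked version would produce, since the `else` branch of the overflow check is never taken for a valid range.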
