Doc BTree better, and add some iteration benches #17801

Merged: 3 commits, Oct 11, 2014
78 changes: 77 additions & 1 deletion src/libcollections/btree/map.rs
@@ -29,6 +29,47 @@ use ringbuf::RingBuf;


/// A map based on a B-Tree.
///
/// B-Trees represent a fundamental compromise between cache-efficiency and actually minimizing
/// the amount of work performed in a search. In theory, a binary search tree (BST) is the optimal
/// choice for a sorted map, as a perfectly balanced BST performs the theoretical minimum number
/// of comparisons necessary to find an element (log<sub>2</sub>n). However, in practice the way
/// this is done is *very* inefficient for modern computer architectures. In particular, every
/// element is stored in its own individually heap-allocated node. This means that every single
/// insertion triggers a heap allocation, and every single comparison is likely to be a cache
/// miss. Since these are both notably expensive things to do in practice, we are forced to, at
/// the very least, reconsider the BST strategy.
///
/// A B-Tree instead makes each node contain B-1 to 2B-1 elements in a contiguous array. By doing
/// this, we reduce the number of allocations by a factor of B, and improve cache efficiency in
/// searches. However, this does mean that searches will have to do *more* comparisons on average.
/// The precise number of comparisons depends on the node search strategy used. For optimal cache
/// efficiency, one could search the nodes linearly. For optimal comparisons, one could search
/// the node using binary search. As a compromise, one could also perform a linear search
/// that initially only checks every i<sup>th</sup> element for some choice of i.
///
/// Currently, our implementation simply performs naive linear search. This provides excellent
/// performance on *small* nodes of elements which are cheap to compare. However, in the future
/// we would like to further explore choosing the optimal search strategy based on the choice of
/// B, and possibly other factors. Using linear search, searching for a random element is
/// expected to take O(B log<sub>B</sub>n) comparisons, which is generally worse than a BST. In
/// practice, however, performance is excellent: `BTreeMap` readily outperforms `TreeMap` under
/// many workloads, and remains competitive where it doesn't. `BTreeMap` also generally *scales*
/// better than `TreeMap`, making it more appropriate for large datasets.
///
/// However, `TreeMap` may still be more appropriate in some contexts. If elements are very large
/// or expensive to compare, `TreeMap` may be the better choice: it won't allocate any more space
/// than is needed, and will perform the minimal number of comparisons necessary. `TreeMap` also
/// provides much better performance stability guarantees. Generally, very few changes need to be
/// made to update a BST, and two updates are expected to take about the same amount of time on
/// trees of roughly equal size. A B-Tree's performance, by contrast, is much more amortized. If
/// a node is overfull, it must be split into two nodes. If a node is underfull, it may be merged
/// with a neighbour. Both of these operations are relatively expensive to perform, and it's
/// possible to force one to occur at every single level of the tree in a single insertion or
/// deletion. In fact, a malicious or otherwise unlucky sequence of insertions and deletions can
/// force this degenerate behaviour to occur on every operation. While the total amount of work
/// done on each operation isn't *catastrophic*, and *is* still bounded by O(B log<sub>B</sub>n),
/// such an operation is certainly much slower than a typical one.
#[deriving(Clone)]
pub struct BTreeMap<K, V> {
root: Node<K, V>,
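
As an editor's aside, here is a minimal sketch of the node search strategies contrasted in the documentation above. It is not part of this diff: it is written in modern Rust syntax against a bare sorted slice of keys, and the Ok/Err index convention (Ok for a hit in this node, Err for the child to descend into) is an assumption made purely for illustration.

use std::cmp::Ordering;

// Linear scan: walks the keys in order, so it is cache friendly, but it costs
// O(B) comparisons per node.
fn search_linear<K: Ord>(keys: &[K], key: &K) -> Result<usize, usize> {
    for (i, k) in keys.iter().enumerate() {
        match key.cmp(k) {
            Ordering::Less => return Err(i),   // descend into child i
            Ordering::Equal => return Ok(i),   // found the key in this node
            Ordering::Greater => {}
        }
    }
    Err(keys.len())                            // descend into the rightmost child
}

// Binary search: only O(log B) comparisons per node, but the probe pattern jumps
// around the array, which is harder on the cache when keys are small and cheap to compare.
fn search_binary<K: Ord>(keys: &[K], key: &K) -> Result<usize, usize> {
    keys.binary_search(key)
}

// Compromise: stride over every i-th key first (assumes i >= 1), then scan the
// short remaining run linearly.
fn search_strided<K: Ord>(keys: &[K], key: &K, i: usize) -> Result<usize, usize> {
    let mut start = 0;
    while start + i < keys.len() && keys[start + i] <= *key {
        start += i;
    }
    match search_linear(&keys[start..], key) {
        Ok(idx) => Ok(start + idx),
        Err(idx) => Err(start + idx),
    }
}
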
@@ -93,6 +134,8 @@ impl<K: Ord, V> BTreeMap<K, V> {
}

/// Makes a new empty BTreeMap with the given B.
///
/// B cannot be less than 2.
pub fn with_b(b: uint) -> BTreeMap<K, V> {
assert!(b > 1, "B must be greater than 1");
BTreeMap {
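
A quick usage sketch of the documented constraint, in the same pre-1.0 syntax as the surrounding code (the variable name is illustrative only, not part of the diff):

    // Editor's sketch: with_b accepts any B of at least 2 and asserts otherwise.
    let map: BTreeMap<uint, uint> = BTreeMap::with_b(6);   // fine: 6 > 1
    // BTreeMap::<uint, uint>::with_b(1);                  // would panic: "B must be greater than 1"
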
@@ -1145,9 +1188,12 @@ mod test {

#[cfg(test)]
mod bench {
use test::Bencher;
use std::prelude::*;
use std::rand::{weak_rng, Rng};
use test::{Bencher, black_box};

use super::BTreeMap;
use MutableMap;
use deque::bench::{insert_rand_n, insert_seq_n, find_rand_n, find_seq_n};

#[bench]
@@ -1200,4 +1246,34 @@ mod bench {
let mut m : BTreeMap<uint,uint> = BTreeMap::new();
find_seq_n(10_000, &mut m, b);
}

fn bench_iter(b: &mut Bencher, size: uint) {
let mut map = BTreeMap::<uint, uint>::new();
let mut rng = weak_rng();

for _ in range(0, size) {
map.swap(rng.gen(), rng.gen());
Member: Any reason to go for swap instead of insert?

Contributor (author): No joke, there was actually a thought process here: I expect insert to be deprecated in the future. If the changes are performed mechanically, this code could get transformed into something weird (since we don't care about the return value). It's a bit silly, honestly; swap and insert are functionally identical, so it doesn't really matter, and I figured I'd play it safe.

Member: Uh, I'm not sure I'm on board with deprecating the name insert, but okay.

Contributor (author): The proposal in the collections reform RFC is that swap would be renamed to insert. In my head, all inserts then get translated to insert().is_some().

Member: Okay, cool, that's what I was hoping.
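
Editor's note, for readers following the swap-versus-insert exchange: a sketch of the two pre-reform MutableMap methods being discussed. The signatures below are recalled from the 0.12-era trait and should be treated as approximate.

    // Approximate pre-collections-reform MutableMap signatures (editor's recollection):
    //     fn insert(&mut self, key: K, value: V) -> bool;    // true if the key was not already present
    //     fn swap(&mut self, key: K, value: V) -> Option<V>; // the previous value, if any
    // The benchmark ignores the return value either way, so the two calls behave identically here:
    map.swap(rng.gen(), rng.gen());      // discards the Option<V>
    // map.insert(rng.gen(), rng.gen()); // would discard the bool instead
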

}

b.iter(|| {
for entry in map.iter() {
black_box(entry);
}
});
}

#[bench]
pub fn iter_20(b: &mut Bencher) {
bench_iter(b, 20);
}

#[bench]
pub fn iter_1000(b: &mut Bencher) {
bench_iter(b, 1000);
}

#[bench]
pub fn iter_100000(b: &mut Bencher) {
bench_iter(b, 100000);
}
}
5 changes: 5 additions & 0 deletions src/libcollections/btree/set.rs
@@ -23,6 +23,9 @@ use core::fmt::Show;
use {Mutable, Set, MutableSet, MutableMap, Map};

/// A set based on a B-Tree.
///
/// See `BTreeMap`'s documentation for a detailed discussion of this collection's performance
/// benefits and drawbacks.
#[deriving(Clone, Hash, PartialEq, Eq, Ord, PartialOrd)]
pub struct BTreeSet<T>{
map: BTreeMap<T, ()>,
@@ -65,6 +68,8 @@ impl<T: Ord> BTreeSet<T> {
}

/// Makes a new BTreeSet with the given B.
///
/// B cannot be less than 2.
pub fn with_b(b: uint) -> BTreeSet<T> {
BTreeSet { map: BTreeMap::with_b(b) }
}
35 changes: 34 additions & 1 deletion src/libcollections/treemap.rs
@@ -2232,9 +2232,12 @@ mod test_treemap {

#[cfg(test)]
mod bench {
use test::Bencher;
use std::prelude::*;
use std::rand::{weak_rng, Rng};
use test::{Bencher, black_box};

use super::TreeMap;
use MutableMap;
use deque::bench::{insert_rand_n, insert_seq_n, find_rand_n, find_seq_n};

// Find seq
@@ -2288,6 +2291,36 @@ mod bench {
let mut m : TreeMap<uint,uint> = TreeMap::new();
find_seq_n(10_000, &mut m, b);
}

fn bench_iter(b: &mut Bencher, size: uint) {
let mut map = TreeMap::<uint, uint>::new();
let mut rng = weak_rng();

for _ in range(0, size) {
map.swap(rng.gen(), rng.gen());
}

b.iter(|| {
for entry in map.iter() {
black_box(entry);
}
});
}

#[bench]
pub fn iter_20(b: &mut Bencher) {
bench_iter(b, 20);
}

#[bench]
pub fn iter_1000(b: &mut Bencher) {
bench_iter(b, 1000);
}

#[bench]
pub fn iter_100000(b: &mut Bencher) {
bench_iter(b, 100000);
}
}

#[cfg(test)]
40 changes: 24 additions & 16 deletions src/libcollections/trie.rs
@@ -948,8 +948,8 @@ macro_rules! iterator_impl {
// rules, and are just manipulating raw pointers like there's no
// such thing as invalid pointers and memory unsafety. The
// reason is performance, without doing this we can get the
// bench_iter_large microbenchmark down to about 30000 ns/iter
// (using .unsafe_get to index self.stack directly, 38000
// (now replaced) bench_iter_large microbenchmark down to about
// 30000 ns/iter (using .unsafe_get to index self.stack directly, 38000
// ns/iter with [] checked indexing), but this smashes that down
// to 13500 ns/iter.
//
@@ -1458,31 +1458,39 @@ mod test_map {
mod bench_map {
use std::prelude::*;
use std::rand::{weak_rng, Rng};
use test::Bencher;
use test::{Bencher, black_box};

use MutableMap;
use super::TrieMap;

#[bench]
fn bench_iter_small(b: &mut Bencher) {
let mut m = TrieMap::<uint>::new();
fn bench_iter(b: &mut Bencher, size: uint) {
let mut map = TrieMap::<uint>::new();
let mut rng = weak_rng();
for _ in range(0u, 20) {
m.insert(rng.gen(), rng.gen());

for _ in range(0, size) {
map.swap(rng.gen(), rng.gen());
}

b.iter(|| for _ in m.iter() {})
b.iter(|| {
for entry in map.iter() {
black_box(entry);
}
});
}

#[bench]
fn bench_iter_large(b: &mut Bencher) {
let mut m = TrieMap::<uint>::new();
let mut rng = weak_rng();
for _ in range(0u, 1000) {
m.insert(rng.gen(), rng.gen());
}
pub fn iter_20(b: &mut Bencher) {
bench_iter(b, 20);
}

b.iter(|| for _ in m.iter() {})
#[bench]
pub fn iter_1000(b: &mut Bencher) {
bench_iter(b, 1000);
}

#[bench]
pub fn iter_100000(b: &mut Bencher) {
bench_iter(b, 100000);
}

#[bench]