Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add "nth_element" function for [T] #1470

Closed
Mokosha opened this issue Jan 21, 2016 · 12 comments
Closed

Add "nth_element" function for [T] #1470

Mokosha opened this issue Jan 21, 2016 · 12 comments
Labels
T-libs-api Relevant to the library API team, which will review and decide on the RFC.

Comments

@Mokosha
Copy link

Mokosha commented Jan 21, 2016

Apologies if this has been brought up before -- a cursory search with google brought up nothing.

Analogous to the std::nth_element function in C++, Rust slices should be able to do a partial quicksort to just get to the nth smallest element. I believe that this is fairly straightforward and analogous to the C++ implementation, and people have already mentioned it a few times. I suspect the function signatures would look something like this:

impl<T> [T] {
  fn nth_element(&mut self, n: usize) where T : Ord;
  fn nth_element_by<F>(&mut self, n: usize, f: F) where F: FnMut(&T, &T) -> Ordering;
}

I think the name nth_element is a bit clunky though, so perhaps sort_at_element would be better? In particular, I've linked the paper used in most C++ implementations for the introsort algorithm.

@Mokosha Mokosha changed the title Add quickselect implementation for Vec Add quickselect implementation for [T] Jan 21, 2016
@ticki
Copy link
Contributor

ticki commented Jan 21, 2016

👍. Maybe sort_n or partial_sort?

@ranma42
Copy link
Contributor

ranma42 commented Jan 21, 2016

I would specify the operation instead of the algorithm. (why quickselect instead of other selection algorithms? I would try to avoid locking into a specific one, so that we can decide based on benchmarks and possibly change our choice).

Also notice that usually a selection algorithm does not sort the head nor the tail of the array. In most cases it only guarantees that they are respectively <= and >= of the selected element.
Maybe partition_at_nth might be a good name for this operation?

We might also want a partial_sort, but that is another operation, which is easily obtained on top of partition_at_nth + sort (of the desired slice).

@ticki
Copy link
Contributor

ticki commented Jan 21, 2016

Right, the name shouldn't expose the machinery behind it (this is the same reason sort() is named as it is).

@ranma42 partition_at_nth requires a full reorganization of the vector, which is not intended.

@ranma42
Copy link
Contributor

ranma42 commented Jan 21, 2016

I thought that was the operation that was proposed by@Mokosha, because std::nth_element guarantees that:

  • the element pointed at by nth is changed to whatever element would occur in that position if [first, last) [aka the whole slice] was sorted.
  • all of the elements before this new nth element are less than or equal to the elements after the new nth element
  • (the two guarantees imply that none of the items after the nth can be greater than the nth)

This is also (not coincidentally ;) ) what quickselect does.

@ticki What is the operation you have in mind?

@Mokosha
Copy link
Author

Mokosha commented Jan 21, 2016

Yes, I'm proposing a full reshuffle of the vector such that all elements after n are greater (for some definition of "greater") than all elements before n.

Also, I wasn't suggesting we use quickselect, just that that's a choice. The interface is more important.

@Mokosha Mokosha changed the title Add quickselect implementation for [T] Add "nth_element" function for [T] Jan 21, 2016
@kamalmarhubi
Copy link
Contributor

Any use cases that suggest why this should be in std rather than crates.io? Most uses in the first few pages of GitHub search results show C++'s std::nth_element being used for online judges / coding contests.

@Mokosha
Copy link
Author

Mokosha commented Jan 22, 2016

Spatial data structures use this functionality a lot when partitioning sets along a coordinate axis. I'm not sure where the line is for usefulness w.r.t. adding it to libstd, though. I was under the impression that crates extending standard library types (especially in such a narrow scope) would be frowned upon?

Mokosha added a commit to Mokosha/pbrt_rust that referenced this issue Jan 23, 2016
This function partially sorts a mutable slice such that the values
in the bottom half of the array are less than the values in the top
half of the array. The definition of "less than" is determined by
the function passed to partition_by.

This function is useful in building BVH trees, and should be replaced
with rust-lang/rfcs#1470 if it matures into the stdlib.
@nrc nrc added the T-libs-api Relevant to the library API team, which will review and decide on the RFC. label Aug 18, 2016
@IGI-111
Copy link

IGI-111 commented Oct 19, 2018

Any progress on this? I've found myself needing an std::nth_element equivalent to port some FST algorithms.

Regardless of whether it's on topic for libstd, @Mokosha since you seem to have an implementation would you care to put that code in a crate or give someone permission to do so?

@Centril
Copy link
Contributor

Centril commented Oct 19, 2018

@IGI-111 I would suggest opening a PR on rust-lang/rust which implements the functionality in the standard library; please link back to this issue if you do. Once you do that the library team will decide whether to include it or not.

@Mokosha
Copy link
Author

Mokosha commented Oct 19, 2018

You're free to copy/paste my implementation (n/2 is hardcoded as the pivot point), but I'll see if I can open a PR for libstd in the case that people might find it more generally useful.

@codyps
Copy link

codyps commented Oct 25, 2018

FYI: there exist some crates from crates.io for this: kth, pdqselect, order-stat

(just in case others end up at this issue while searching).

@Centril
Copy link
Contributor

Centril commented Oct 25, 2018

Closing this issue since it was opened on the rust issue tracker.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-libs-api Relevant to the library API team, which will review and decide on the RFC.
Projects
None yet
Development

No branches or pull requests

8 participants