-
Notifications
You must be signed in to change notification settings - Fork 13.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add BufRead::read_while #70772
Add BufRead::read_while #70772
Conversation
r? @sfackler (rust_highfive has picked a reviewer for you, use r? to override) |
I am confused why CI is failing. Perhaps someone with more knowledge on Rust's CI could point me in the right direction? |
#69898 will hopefully unblock the CI, right now mingw-check fails on outdated toolstate script that doesn't take renamed rustc-guide into account properly |
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
/// assert_eq!(buf, b"-ipsum"); | ||
/// ``` | ||
#[unstable(feature = "buf_read_read_while", issue = "none")] | ||
fn read_while<P>(&mut self, buf: &mut Vec<u8>, predicate: P) -> Result<usize> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably worth updating read_until to delegate to this method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
read_until
uses memchr
to scan for the token; it seems likely this operation is vectorized in many cases, and I'm afraid moving to read_while
's linear scan could inadvertently cause a slowdown.
I'm a bit worried that BufRead will have both read_until and read_while with different arguments but the naming doesn't make it super clear which is which. |
@sfackler That's a fair concern. My reasoning for naming this method I think I think you're right that the naming of these methods could confuse people who aren't already familiar with Rust. However both methods seem useful and reasonably named. I think the best resolution here would be to acknowledge that both methods indeed have a close resemblance, and document how they differ. For example we could add something like following to
|
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
I don't quite follow the motivation. As for the second paragraph I don't see the problem of taking |
@HeroicKatora Ah yes, fair enough. I didn't realize that this was implemented for
Perhaps adding more details could be helpful here. A while ago I was implementing a parser and encoder for the MIME spec and figured we could abstract it to take any
This means: "read until you encounter either // Get the param name.
//
// ```txt
// text/html; charset=utf-8;
// ^^^^^^^
// ```
let mut param_name = vec![];
s.read_while(&mut param_name, |b| !matches!(b, b';' | b'='))?; Neither |
It seems clear that the existing methods are insufficient, adding this convenience would be good complement. Perhaps this was not very clear but I wasn't trying to questioning the need per-se, but the need of that solution in particular. Which alternatives were considered and why was this one chosen? The adaptor solution that I was trying to hint at would look like: let mut param_name = vec![];
s
.by_ref()
.take_while(|b| !matches!(b, b';' | b'='))
.read_to_end(&mut param_name)?; which parallels |
@HeroicKatora Oh I really like that suggestion! That's a design I hadn't considered, and I'll be the first to admit that it ticks all the same boxes as what I had, but with fewer downsides (as @sfackler mentioned: argument ordering of I think the right next step is to close this PR and file an issue to implement |
This patch implements
BufRead::read_while
, a counterpart toBufRead::read_until
that takes a predicate instead of a single byte. This allows delimiting byte streams using more refined patterns, which in particularly works well with the newly stabilizedmatches
macro.Usage
Motivation
The
Read
andBufRead
traits provide far fewer adapters thanIterator
does. The idea is that in most cases it should be possible to convert aRead
into aBufRead
, and subsequently calllines
orsplit
to convert it into anIterator
.However both
BufRead::split
andBufRead::lines
consumeself
which is problematic for streams that want to parse only a section of a stream but leave the rest of the stream intact. In such cases it can be incredibly convenient to be able to directly operate on the stream (a common example are "headers" sections). And adapters such asBufRead::read_while
make such operations faster to author and less prone to errors.Differences with BufRead::read_until
BufRead::read_until
provides an inclusive read,BufRead::read_while
provides a non-inclusive read. When thepredicate
no longer returnstrue
, the operation finishes and the buffer becomes available with all bytes read up until that point.References
BufRead::read_until
Iterator::take_while
omnom::BufReadExt::read_while