Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

allow predicate functions to lstrip etc. and default to stripping unicode whitespace #27309

Merged
merged 1 commit into from
Jun 15, 2018
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion base/shell.jl
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@ function rstrip_shell(s::AbstractString)
c_old = nothing
for (i, c) in Iterators.reverse(pairs(s))
((c == '\\') && c_old == ' ') && return SubString(s, 1, i+1)
c in _default_delims || return SubString(s, 1, i)
isspace(c) || return SubString(s, 1, i)
c_old = c
end
SubString(s, 1, 0)
Expand Down
54 changes: 31 additions & 23 deletions base/strings/util.jl
Original file line number Diff line number Diff line change
Expand Up @@ -134,16 +134,16 @@ function chomp(s::String)
end
end

const _default_delims = [' ','\t','\n','\v','\f','\r']

"""
lstrip(s::AbstractString[, chars::Chars])
lstrip(str::AbstractString[, chars])

Remove leading characters from `str`.

Return `s` with any leading whitespace and delimiters removed.
The default delimiters to remove are `' '`, `\\t`, `\\n`, `\\v`,
`\\f`, and `\\r`.
If `chars` (a character, or vector or set of characters) is provided,
instead remove characters contained in it.
The default behaviour is to remove leading whitespace and delimiters: see
[`isspace`](@ref) for precise details.

The optional `chars` argument specifies which characters to remove: it can be a single character,
vector or set of characters, or a predicate function.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The predicate function should always go first per our conventions, documented in the style guide in the manual.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a "general rule" not a "must". Having the function argument second here seems much more natural to me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

guidelines

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree wholeheartedly; it seems very unnatural to me.


# Examples
```jldoctest
Expand All @@ -154,22 +154,25 @@ julia> lstrip(a)
"March"
```
"""
function lstrip(s::AbstractString, chars::Chars=_default_delims)
function lstrip(s::AbstractString, f=isspace)
e = lastindex(s)
for (i, c) in pairs(s)
!(c in chars) && return SubString(s, i, e)
!f(c) && return SubString(s, i, e)
end
SubString(s, e+1, e)
end
lstrip(s::AbstractString, chars::Chars) = lstrip(s, in(chars))

"""
rstrip(s::AbstractString[, chars::Chars])
rstrip(str::AbstractString[, chars])

Remove trailing characters from `str`.

Return `s` with any trailing whitespace and delimiters removed.
The default delimiters to remove are `' '`, `\\t`, `\\n`, `\\v`,
`\\f`, and `\\r`.
If `chars` (a character, or vector or set of characters) is provided,
instead remove characters contained in it.
The default behaviour is to remove leading whitespace and delimiters: see
[`isspace`](@ref) for precise details.

The optional `chars` argument specifies which characters to remove: it can be a single character,
vector or set of characters, or a predicate function.

# Examples
```jldoctest
Expand All @@ -180,19 +183,24 @@ julia> rstrip(a)
"March"
```
"""
function rstrip(s::AbstractString, chars::Chars=_default_delims)
function rstrip(s::AbstractString, f=isspace)
for (i, c) in Iterators.reverse(pairs(s))
c in chars || return SubString(s, 1, i)
f(c) || return SubString(s, 1, i)
end
SubString(s, 1, 0)
end
rstrip(s::AbstractString, chars::Chars) = rstrip(s, in(chars))

"""
strip(s::AbstractString, [chars::Chars])
strip(str::AbstractString, [chars])

Remove leading and trailing characters from `str`.

The default behaviour is to remove leading whitespace and delimiters: see
[`isspace`](@ref) for precise details.

Return `s` with any leading and trailing whitespace removed.
If `chars` (a character, or vector or set of characters) is provided,
instead remove characters contained in it.
The optional `chars` argument specifies which characters to remove: it can be a single character,
vector or set of characters, or a predicate function.

# Examples
```jldoctest
Expand All @@ -201,7 +209,7 @@ julia> strip("{3, 5}\\n", ['{', '}', '\\n'])
```
"""
strip(s::AbstractString) = lstrip(rstrip(s))
strip(s::AbstractString, chars::Chars) = lstrip(rstrip(s, chars), chars)
strip(s::AbstractString, chars) = lstrip(rstrip(s, chars), chars)

## string padding functions ##

Expand Down
2 changes: 1 addition & 1 deletion stdlib/DelimitedFiles/src/DelimitedFiles.jl
Original file line number Diff line number Diff line change
Expand Up @@ -590,7 +590,7 @@ function dlm_parse(dbuff::String, eol::D, dlm::D, qchar::D, cchar::D,
val,idx = iterate(dbuff, idx)
if (is_eol = (Char(val) == Char(eol)))
is_dlm = is_comment = is_cr = is_quote = false
elseif (is_dlm = (is_default_dlm ? in(Char(val), _default_delims) : (Char(val) == Char(dlm))))
elseif (is_dlm = (is_default_dlm ? isspace(Char(val)) : (Char(val) == Char(dlm))))
is_comment = is_cr = is_quote = false
elseif (is_quote = (Char(val) == Char(qchar)))
is_comment = is_cr = false
Expand Down
1 change: 1 addition & 0 deletions test/strings/util.jl
Original file line number Diff line number Diff line change
Expand Up @@ -50,6 +50,7 @@ end
@test strip(" ") == ""
@test strip(" ") == ""
@test strip("\t hi \n") == "hi"
@test strip(" \u2009 hi \u2009 ") == "hi"
@test strip("foobarfoo", ['f','o']) == "bar"
@test strip("foobarfoo", ('f','o')) == "bar"

Expand Down