Commit 402f9ed

fixes from feedback
1 parent 07c665d commit 402f9ed

4 files changed, +9 -8 lines changed

base/char.jl

+3-3
@@ -18,7 +18,8 @@ representable in a given `AbstractChar` type.
 Internally, an `AbstractChar` type may use a variety of encodings. Conversion
 to `UInt32` will not reveal this encoding because it always returns the
 Unicode value of the character. (Typically, the raw encoding can be obtained
-via [`reinterpret`](@ref).)
+via [`reinterpret`](@ref).) Character I/O uses UTF-8 by default for all
+character types, regardless of their internal encoding.
 """
 AbstractChar
 
@@ -148,8 +149,7 @@ hash(x::Char, h::UInt) =
 # fallbacks:
 isless(x::AbstractChar, y::AbstractChar) = isless(Char(x), Char(y))
 ==(x::AbstractChar, y::AbstractChar) = Char(x) == Char(y)
-hash(x::AbstractChar, h::UInt) =
-hash_uint64(((UInt32(x) + UInt64(0xd060fad0)) << 32) ⊻ UInt64(h))
+hash(x::AbstractChar, h::UInt) = hash(Char(x), h)
 widen(::Type{T}) where {T<:AbstractChar} = T
 
 -(x::AbstractChar, y::AbstractChar) = Int(x) - Int(y)
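
The `AbstractChar` docstring in the first hunk separates a character's Unicode value from its internal encoding. A minimal sketch of that distinction, assuming a Julia 1.x release and using the built-in `Char` (which, in those releases, stores the UTF-8 bytes left-aligned in 32 bits); the variable names are illustrative only:

```julia
# Assumption: Julia 1.x, where Char holds its UTF-8 bytes in the high bits of a UInt32.
c = 'α'                                  # U+03B1; UTF-8 encoding is 0xCE 0xB1

code_point = UInt32(c)                   # Unicode value, independent of the encoding
raw_bits   = reinterpret(UInt32, c)      # internal representation of Char
io_bytes   = collect(codeunits(sprint(print, c)))  # bytes produced by character I/O

println(code_point == 0x000003b1)
println(raw_bits == 0xceb10000)
println(io_bytes == [0xce, 0xb1])
```

Each comparison is expected to print `true` under the stated assumption; a custom `AbstractChar` subtype could use a different internal representation while still reporting the same `UInt32` value.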

base/strings/basic.jl

+2-2
@@ -14,8 +14,8 @@ about strings:
 * String indexing is done in terms of these code units:
 * Characters are extracted by `s[i]` with a valid string index `i`
 * Each `AbstractChar` in a string is encoded by one or more code units
-* Only the index of the first code unit of a `AbstractChar` is a valid index
-* The encoding of a `AbstractChar` is independent of what precedes or follows it
+* Only the index of the first code unit of an `AbstractChar` is a valid index
+* The encoding of an `AbstractChar` is independent of what precedes or follows it
 * String encodings are [self-synchronizing] – i.e. `isvalid(s, i)` is O(1)
 
 [self-synchronizing]: https://en.wikipedia.org/wiki/Self-synchronizing_code
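
The indexing rules listed in this docstring can be checked with a short sketch. It assumes a Julia 1.x release and uses the built-in UTF-8 `String` as a stand-in for any `AbstractString`; the sample string is illustrative only:

```julia
# Assumption: Julia 1.x; String is UTF-8, so 'α' occupies two code units.
s = "aαb"

println(ncodeunits(s))   # total code units: 1 + 2 + 1
println(isvalid(s, 2))   # index 2 is the first code unit of 'α'
println(isvalid(s, 3))   # index 3 is the second code unit of 'α', so not a valid index
println(s[2])            # extracts the whole character starting at index 2
println(nextind(s, 2))   # index of the character after 'α'
```

Under that assumption the calls report 4 code units, `true`, `false`, the character `α`, and 4 as the next valid index.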

base/strings/util.jl

+1-1
@@ -410,7 +410,7 @@ If `count` is provided, replace at most `count` occurrences.
 or a regular expression.
 If `r` is a function, each occurrence is replaced with `r(s)`
 where `s` is the matched substring (when `pat`is a `Regex` or `AbstractString`) or
-character (when `pat` is a `AbstractChar` or a collection of `AbstractChar`).
+character (when `pat` is an `AbstractChar` or a collection of `AbstractChar`).
 If `pat` is a regular expression and `r` is a `SubstitutionString`, then capture group
 references in `r` are replaced with the corresponding matched text.
 To remove instances of `pat` from `string`, set `r` to the empty `String` (`""`).

doc/src/manual/strings.md

+3-2
@@ -28,8 +28,9 @@ There are a few noteworthy high-level features about Julia's strings:
 additional `AbstractString` subtypes (e.g. for other encodings). If you define a function expecting
 a string argument, you should declare the type as `AbstractString` in order to accept any string
 type.
-* Like C and Java, but unlike most dynamic languages, Julia has a first-class type representing
-a single character, called `AbstractChar`. This is just a special kind of 32-bit primitive type whose numeric value represents a Unicode code point.
+* Like C and Java, but unlike most dynamic languages, Julia has a first-class type for representing
+a single character, called `AbstractChar`. The built-in `Char` subtype of `AbstractChar`
+is a 32-bit primitive type that can represent any Unicode character.
 * As in Java, strings are immutable: the value of an `AbstractString` object cannot be changed.
 To construct a different string value, you construct a new string from parts of other strings.
 * Conceptually, a string is a *partial function* from indices to characters: for some index values,
