Commit 86970a1

Merge pull request #44 from Jules-Bertholet/dont-be-shy
Remove soft hyphen special case
2 parents 3aa94a5 + 05ee35d commit 86970a1

5 files changed: +13 -18 lines changed

README.md (+2 -2)
@@ -4,8 +4,8 @@
 [![crates.io version](https://img.shields.io/crates/v/unicode-width)](https://crates.io/crates/unicode-width)
 [![Docs status](https://img.shields.io/docsrs/unicode-width)](https://docs.rs/unicode-width/)
 
-Determine displayed width of `char` and `str` types according to [Unicode Standard Annex #11][UAX11],
-other portions of the Unicode standard, and common implementations of POSIX [`wcwidth()`](https://pubs.opengroup.org/onlinepubs/9699919799/).
+Determine displayed width of `char` and `str` types according to [Unicode Standard Annex #11][UAX11]
+and other portions of the Unicode standard.
 
 This crate is `#![no_std]`.
 

scripts/unicode.py (-3)
@@ -776,9 +776,6 @@ def main(module_path: str):
         map(lambda x: EffectiveWidth.ZERO if x[1] else x[0], zip(eaw_map, zw_map))
     )
 
-    # Override for soft hyphen
-    width_map[0x00AD] = EffectiveWidth.NARROW
-
     tables = make_tables(TABLE_CFGS, enumerate(width_map))
 
     emoji_presentations = load_emoji_presentation_sequences()
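
The effect of dropping the override can be sketched in Rust, the crate's own language. Only the names `EffectiveWidth`, `eaw_map`, `zw_map` and `width_map` come from the script; the Rust enum, the `merge_widths` helper and the three-entry toy inputs are hypothetical stand-ins. The inputs assume that U+00AD is flagged as zero-width, which matches the updated tests in this commit.

```rust
/// Hypothetical Rust mirror of the two `EffectiveWidth` values used in this hunk.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum EffectiveWidth {
    Zero,
    Narrow,
}

/// Mirrors the script's merge step: an entry flagged as zero-width becomes
/// `Zero`; every other entry keeps its East Asian Width based value.
fn merge_widths(eaw_map: &[EffectiveWidth], zw_map: &[bool]) -> Vec<EffectiveWidth> {
    eaw_map
        .iter()
        .zip(zw_map)
        .map(|(&eaw, &is_zero_width)| {
            if is_zero_width {
                EffectiveWidth::Zero
            } else {
                eaw
            }
        })
        .collect()
}

fn main() {
    // Toy maps with three entries; index 1 stands in for U+00AD (assumption).
    let eaw_map = [EffectiveWidth::Narrow; 3];
    let zw_map = [false, true, false];

    // After this commit: the merged value is used as-is.
    let width_map = merge_widths(&eaw_map, &zw_map);
    println!("without override: {:?}", width_map[1]);

    // Before this commit: the merged value was overwritten for the soft hyphen.
    let mut old_width_map = width_map.clone();
    old_width_map[1] = EffectiveWidth::Narrow;
    println!("with override:    {:?}", old_width_map[1]);
}
```

With the override gone, the soft hyphen is treated like any other entry that the zero-width map flags.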

src/lib.rs (+8 -10)
@@ -9,9 +9,8 @@
 // except according to those terms.
 
 //! Determine displayed width of `char` and `str` types according to
-//! [Unicode Standard Annex #11](http://www.unicode.org/reports/tr11/),
-//! other portions of the Unicode standard, and common implementations of
-//! POSIX [`wcwidth()`](https://pubs.opengroup.org/onlinepubs/9699919799/).
+//! [Unicode Standard Annex #11](http://www.unicode.org/reports/tr11/)
+//! and other portions of the Unicode standard.
 //! See the [Rules for determining width](#rules-for-determining-width) section
 //! for the exact rules.
 //!
@@ -39,9 +38,8 @@
 //! iff their base character fulfills all the following requirements:
 //! - Has the [`Emoji_Presentation`] property, and
 //! - Not in the [Enclosed Ideographic Supplement] block.
-//! 3. [`'\u{00AD}'` SOFT HYPHEN](https://util.unicode.org/UnicodeJsps/character.jsp?a=00AD) has width 1.
-//! 4. [`'\u{115F}'` HANGUL CHOSEONG FILLER](https://util.unicode.org/UnicodeJsps/character.jsp?a=115F) has width 2.
-//! 5. The following have width 0:
+//! 3. [`'\u{115F}'` HANGUL CHOSEONG FILLER](https://util.unicode.org/UnicodeJsps/character.jsp?a=115F) has width 2.
+//! 4. The following have width 0:
 //! - [Characters](https://util.unicode.org/UnicodeJsps/list-unicodeset.jsp?a=%5Cp%7BDefault_Ignorable_Code_Point%7D)
 //! with the [`Default_Ignorable_Code_Point`](https://www.unicode.org/versions/Unicode15.0.0/ch05.pdf#G40095) property.
 //! - [Characters](https://util.unicode.org/UnicodeJsps/list-unicodeset.jsp?a=%5Cp%7BGrapheme_Extend%7D)
@@ -58,13 +56,13 @@
 //! - [Characters](https://util.unicode.org/UnicodeJsps/list-unicodeset.jsp?a=%5Cp%7BHangul_Syllable_Type%3DV%7D%5Cp%7BHangul_Syllable_Type%3DT%7D)
 //! with a [`Hangul_Syllable_Type`] of `Vowel_Jamo` (`V`) or `Trailing_Jamo` (`T`).
 //! - [`'\0'` NUL](https://util.unicode.org/UnicodeJsps/character.jsp?a=0000).
-//! 6. The [control characters](https://util.unicode.org/UnicodeJsps/list-unicodeset.jsp?a=%5Cp%7BCc%7D)
+//! 5. The [control characters](https://util.unicode.org/UnicodeJsps/list-unicodeset.jsp?a=%5Cp%7BCc%7D)
 //! have no defined width, and are ignored when determining the width of a string.
-//! 7. [Characters](https://util.unicode.org/UnicodeJsps/list-unicodeset.jsp?a=%5Cp%7BEast_Asian_Width%3DF%7D%5Cp%7BEast_Asian_Width%3DW%7D)
+//! 6. [Characters](https://util.unicode.org/UnicodeJsps/list-unicodeset.jsp?a=%5Cp%7BEast_Asian_Width%3DF%7D%5Cp%7BEast_Asian_Width%3DW%7D)
 //! with an [`East_Asian_Width`] of [`Fullwidth`] or [`Wide`] have width 2.
-//! 8. [Characters](https://util.unicode.org/UnicodeJsps/list-unicodeset.jsp?a=%5Cp%7BEast_Asian_Width%3DA%7D)
+//! 7. [Characters](https://util.unicode.org/UnicodeJsps/list-unicodeset.jsp?a=%5Cp%7BEast_Asian_Width%3DA%7D)
 //! with an [`East_Asian_Width`] of [`Ambiguous`] have width 2 in an East Asian context, and width 1 otherwise.
-//! 9. All other characters have width 1.
+//! 8. All other characters have width 1.
 //!
 //! [`East_Asian_Width`]: https://www.unicode.org/reports/tr11/#ED1
 //! [`Emoji_Presentation`]: https://unicode.org/reports/tr51/#def_emoji_presentation

src/tables.rs (+1 -1)
@@ -330,7 +330,7 @@ pub mod charwidth {
     static TABLES_2: [u8; 3936] = [
         0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x55, 0x55, 0x55, 0x55, 0x55, 0x55, 0x55,
         0x55, 0x55, 0x55, 0x55, 0x55, 0x55, 0x55, 0x55, 0x55, 0x55, 0x55, 0x55, 0x55, 0x55, 0x55,
-        0x55, 0x15, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x5D, 0xD7, 0x77, 0x75, 0xFF,
+        0x55, 0x15, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x5D, 0xD7, 0x77, 0x71, 0xFF,
         0xF7, 0x7F, 0xFF, 0x55, 0x75, 0x55, 0x55, 0x57, 0xD5, 0x57, 0xF5, 0x5F, 0x75, 0x7F, 0x5F,
         0xF7, 0xD5, 0x7F, 0x77, 0x5D, 0x55, 0x55, 0x55, 0xDD, 0x55, 0xD5, 0x55, 0x55, 0xF5, 0xD5,
         0x55, 0xFD, 0x55, 0x57, 0xD5, 0x7F, 0x57, 0xFF, 0x5D, 0xF5, 0x55, 0x55, 0x55, 0x55, 0xF5,
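
The single changed byte (`0x75` to `0x71`) can be read as follows. This is a minimal sketch under assumptions the diff does not state: that each table byte packs four 2-bit width entries with the lowest code point in the lowest bits, and that this byte covers U+00AC through U+00AF. The helper `packed_entry` is hypothetical and is not part of the crate.

```rust
/// Reads the 2-bit entry for `code_point` out of one packed table byte,
/// assuming four entries per byte, lowest code point in the lowest two bits.
fn packed_entry(byte: u8, code_point: u32) -> u8 {
    let shift = 2 * (code_point & 3);
    (byte >> shift) & 0b11
}

fn main() {
    const SOFT_HYPHEN: u32 = 0x00AD;
    let old_byte: u8 = 0x75; // 0b01_11_01_01
    let new_byte: u8 = 0x71; // 0b01_11_00_01

    println!("old entry for U+00AD: {}", packed_entry(old_byte, SOFT_HYPHEN));
    println!("new entry for U+00AD: {}", packed_entry(new_byte, SOFT_HYPHEN));

    // Under the same assumptions, the other three entries in the byte are untouched.
    for cp in [0x00ACu32, 0x00AE, 0x00AF] {
        assert_eq!(packed_entry(old_byte, cp), packed_entry(new_byte, cp));
    }
}
```

Under these assumptions the entry for U+00AD goes from 1 to 0, in line with the test expectation changing from `Some(1)` to `Some(0)`.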

tests/tests.rs (+2 -2)
@@ -64,8 +64,8 @@ fn test_char2() {
     assert_eq!(UnicodeWidthChar::width('ｈ'), Some(2));
     assert_eq!('ｈ'.width_cjk(), Some(2));
 
-    assert_eq!(UnicodeWidthChar::width('\u{AD}'), Some(1));
-    assert_eq!('\u{AD}'.width_cjk(), Some(1));
+    assert_eq!(UnicodeWidthChar::width('\u{AD}'), Some(0));
+    assert_eq!('\u{AD}'.width_cjk(), Some(0));
 
     assert_eq!(UnicodeWidthChar::width('\u{1160}'), Some(0));
     assert_eq!('\u{1160}'.width_cjk(), Some(0));
