
Commit 2a2bf57

nodejs-github-bot authored and richardlau committed

deps: update icu to 74.1

PR-URL: #50515
Backport-PR-URL: #51973
Reviewed-By: Steven R Loomis <[email protected]>
Reviewed-By: LiviaMedeiros <[email protected]>
Refs: #51933
1 parent fe66e9d commit 2a2bf57

121 files changed (+9208 -7128 lines)


deps/icu-small/LICENSE (+36 -43)

@@ -1,49 +1,42 @@
-UNICODE, INC. LICENSE AGREEMENT - DATA FILES AND SOFTWARE
-
-See Terms of Use <https://www.unicode.org/copyright.html>
-for definitions of Unicode Inc.’s Data Files and Software.
-
-NOTICE TO USER: Carefully read the following legal agreement.
-BY DOWNLOADING, INSTALLING, COPYING OR OTHERWISE USING UNICODE INC.'S
-DATA FILES ("DATA FILES"), AND/OR SOFTWARE ("SOFTWARE"),
-YOU UNEQUIVOCALLY ACCEPT, AND AGREE TO BE BOUND BY, ALL OF THE
-TERMS AND CONDITIONS OF THIS AGREEMENT.
-IF YOU DO NOT AGREE, DO NOT DOWNLOAD, INSTALL, COPY, DISTRIBUTE OR USE
-THE DATA FILES OR SOFTWARE.
+UNICODE LICENSE V3
 
 COPYRIGHT AND PERMISSION NOTICE
 
-Copyright © 1991-2023 Unicode, Inc. All rights reserved.
-Distributed under the Terms of Use in https://www.unicode.org/copyright.html.
-
-Permission is hereby granted, free of charge, to any person obtaining
-a copy of the Unicode data files and any associated documentation
-(the "Data Files") or Unicode software and any associated documentation
-(the "Software") to deal in the Data Files or Software
-without restriction, including without limitation the rights to use,
-copy, modify, merge, publish, distribute, and/or sell copies of
-the Data Files or Software, and to permit persons to whom the Data Files
-or Software are furnished to do so, provided that either
-(a) this copyright and permission notice appear with all copies
-of the Data Files or Software, or
-(b) this copyright and permission notice appear in associated
-Documentation.
-
-THE DATA FILES AND SOFTWARE ARE PROVIDED "AS IS", WITHOUT WARRANTY OF
-ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE
-WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
-NONINFRINGEMENT OF THIRD PARTY RIGHTS.
-IN NO EVENT SHALL THE COPYRIGHT HOLDER OR HOLDERS INCLUDED IN THIS
-NOTICE BE LIABLE FOR ANY CLAIM, OR ANY SPECIAL INDIRECT OR CONSEQUENTIAL
-DAMAGES, OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE,
-DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER
-TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
-PERFORMANCE OF THE DATA FILES OR SOFTWARE.
-
-Except as contained in this notice, the name of a copyright holder
-shall not be used in advertising or otherwise to promote the sale,
-use or other dealings in these Data Files or Software without prior
-written authorization of the copyright holder.
+Copyright © 2016-2023 Unicode, Inc.
+
+NOTICE TO USER: Carefully read the following legal agreement. BY
+DOWNLOADING, INSTALLING, COPYING OR OTHERWISE USING DATA FILES, AND/OR
+SOFTWARE, YOU UNEQUIVOCALLY ACCEPT, AND AGREE TO BE BOUND BY, ALL OF THE
+TERMS AND CONDITIONS OF THIS AGREEMENT. IF YOU DO NOT AGREE, DO NOT
+DOWNLOAD, INSTALL, COPY, DISTRIBUTE OR USE THE DATA FILES OR SOFTWARE.
+
+Permission is hereby granted, free of charge, to any person obtaining a
+copy of data files and any associated documentation (the "Data Files") or
+software and any associated documentation (the "Software") to deal in the
+Data Files or Software without restriction, including without limitation
+the rights to use, copy, modify, merge, publish, distribute, and/or sell
+copies of the Data Files or Software, and to permit persons to whom the
+Data Files or Software are furnished to do so, provided that either (a)
+this copyright and permission notice appear with all copies of the Data
+Files or Software, or (b) this copyright and permission notice appear in
+associated Documentation.
+
+THE DATA FILES AND SOFTWARE ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY
+KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT OF
+THIRD PARTY RIGHTS.
+
+IN NO EVENT SHALL THE COPYRIGHT HOLDER OR HOLDERS INCLUDED IN THIS NOTICE
+BE LIABLE FOR ANY CLAIM, OR ANY SPECIAL INDIRECT OR CONSEQUENTIAL DAMAGES,
+OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS,
+WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
+ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THE DATA
+FILES OR SOFTWARE.
+
+Except as contained in this notice, the name of a copyright holder shall
+not be used in advertising or otherwise to promote the sale, use or other
+dealings in these Data Files or Software without prior written
+authorization of the copyright holder.
 
 ----------------------------------------------------------------------
 

deps/icu-small/README-FULL-ICU.txt (+2 -2)

@@ -1,8 +1,8 @@
 ICU sources - auto generated by shrink-icu-src.py
 
 This directory contains the ICU subset used by --with-intl=full-icu
-It is a strict subset of ICU 73 source files with the following exception(s):
-* deps/icu-small/source/data/in/icudt73l.dat.bz2 : compressed data file
+It is a strict subset of ICU 74 source files with the following exception(s):
+* deps/icu-small/source/data/in/icudt74l.dat.bz2 : compressed data file
 
 
 To rebuild this directory, see ../../tools/icu/README.md

deps/icu-small/source/common/BUILD.bazel (+4)

@@ -603,12 +603,16 @@ cc_library(
         "locbased.cpp",
         "locid.cpp",
         "loclikely.cpp",
+        "loclikelysubtags.cpp",
         "locmap.cpp",
+        "lsr.cpp",
         "resbund.cpp",
         "resource.cpp",
         "uloc.cpp",
         "uloc_tag.cpp",
         "uloc_keytype.cpp",
+        "ulocale.cpp",
+        "ulocbuilder.cpp",
         "uresbund.cpp",
         "uresdata.cpp",
         "wintz.cpp",

deps/icu-small/source/common/brkeng.cpp (+93 -28)

@@ -21,6 +21,7 @@
 #include "unicode/uscript.h"
 #include "unicode/ucharstrie.h"
 #include "unicode/bytestrie.h"
+#include "unicode/rbbi.h"
 
 #include "brkeng.h"
 #include "cmemory.h"
@@ -70,19 +71,21 @@ UnhandledEngine::~UnhandledEngine() {
 }
 
 UBool
-UnhandledEngine::handles(UChar32 c) const {
+UnhandledEngine::handles(UChar32 c, const char* locale) const {
+    (void)locale; // Unused
     return fHandled && fHandled->contains(c);
 }
 
 int32_t
 UnhandledEngine::findBreaks( UText *text,
-                             int32_t /* startPos */,
+                             int32_t startPos,
                              int32_t endPos,
                              UVector32 &/*foundBreaks*/,
                              UBool /* isPhraseBreaking */,
                              UErrorCode &status) const {
     if (U_FAILURE(status)) return 0;
-    UChar32 c = utext_current32(text);
+    utext_setNativeIndex(text, startPos);
+    UChar32 c = utext_current32(text);
     while((int32_t)utext_getNativeIndex(text) < endPos && fHandled->contains(c)) {
         utext_next32(text); // TODO: recast loop to work with post-increment operations.
         c = utext_current32(text);
@@ -120,49 +123,47 @@ ICULanguageBreakFactory::~ICULanguageBreakFactory() {
     }
 }
 
-U_NAMESPACE_END
-U_CDECL_BEGIN
-static void U_CALLCONV _deleteEngine(void *obj) {
-    delete (const icu::LanguageBreakEngine *) obj;
+void ICULanguageBreakFactory::ensureEngines(UErrorCode& status) {
+    static UMutex gBreakEngineMutex;
+    Mutex m(&gBreakEngineMutex);
+    if (fEngines == nullptr) {
+        LocalPointer<UStack> engines(new UStack(uprv_deleteUObject, nullptr, status), status);
+        if (U_SUCCESS(status)) {
+            fEngines = engines.orphan();
+        }
+    }
 }
-U_CDECL_END
-U_NAMESPACE_BEGIN
 
 const LanguageBreakEngine *
-ICULanguageBreakFactory::getEngineFor(UChar32 c) {
+ICULanguageBreakFactory::getEngineFor(UChar32 c, const char* locale) {
     const LanguageBreakEngine *lbe = nullptr;
     UErrorCode status = U_ZERO_ERROR;
+    ensureEngines(status);
+    if (U_FAILURE(status) ) {
+        // Note: no way to return error code to caller.
+        return nullptr;
+    }
 
     static UMutex gBreakEngineMutex;
     Mutex m(&gBreakEngineMutex);
-
-    if (fEngines == nullptr) {
-        LocalPointer<UStack> engines(new UStack(_deleteEngine, nullptr, status), status);
-        if (U_FAILURE(status) ) {
-            // Note: no way to return error code to caller.
-            return nullptr;
-        }
-        fEngines = engines.orphan();
-    } else {
-        int32_t i = fEngines->size();
-        while (--i >= 0) {
-            lbe = (const LanguageBreakEngine *)(fEngines->elementAt(i));
-            if (lbe != nullptr && lbe->handles(c)) {
-                return lbe;
-            }
+    int32_t i = fEngines->size();
+    while (--i >= 0) {
+        lbe = (const LanguageBreakEngine *)(fEngines->elementAt(i));
+        if (lbe != nullptr && lbe->handles(c, locale)) {
+            return lbe;
         }
     }
-
+
     // We didn't find an engine. Create one.
-    lbe = loadEngineFor(c);
+    lbe = loadEngineFor(c, locale);
     if (lbe != nullptr) {
         fEngines->push((void *)lbe, status);
     }
     return U_SUCCESS(status) ? lbe : nullptr;
 }
 
 const LanguageBreakEngine *
-ICULanguageBreakFactory::loadEngineFor(UChar32 c) {
+ICULanguageBreakFactory::loadEngineFor(UChar32 c, const char*) {
     UErrorCode status = U_ZERO_ERROR;
     UScriptCode code = uscript_getScript(c, &status);
     if (U_SUCCESS(status)) {
@@ -299,6 +300,70 @@ ICULanguageBreakFactory::loadDictionaryMatcherFor(UScriptCode script) {
     return nullptr;
 }
 
+
+void ICULanguageBreakFactory::addExternalEngine(
+        ExternalBreakEngine* external, UErrorCode& status) {
+    LocalPointer<ExternalBreakEngine> engine(external, status);
+    ensureEngines(status);
+    LocalPointer<BreakEngineWrapper> wrapper(
+        new BreakEngineWrapper(engine.orphan(), status), status);
+    static UMutex gBreakEngineMutex;
+    Mutex m(&gBreakEngineMutex);
+    fEngines->push(wrapper.getAlias(), status);
+    wrapper.orphan();
+}
+
+BreakEngineWrapper::BreakEngineWrapper(
+    ExternalBreakEngine* engine, UErrorCode &status) : delegate(engine, status) {
+}
+
+BreakEngineWrapper::~BreakEngineWrapper() {
+}
+
+UBool BreakEngineWrapper::handles(UChar32 c, const char* locale) const {
+    return delegate->isFor(c, locale);
+}
+
+int32_t BreakEngineWrapper::findBreaks(
+    UText *text,
+    int32_t startPos,
+    int32_t endPos,
+    UVector32 &foundBreaks,
+    UBool /* isPhraseBreaking */,
+    UErrorCode &status) const {
+    if (U_FAILURE(status)) return 0;
+    int32_t result = 0;
+
+    // Find the span of characters included in the set.
+    // The span to break begins at the current position in the text, and
+    // extends towards the start or end of the text, depending on 'reverse'.
+
+    utext_setNativeIndex(text, startPos);
+    int32_t start = (int32_t)utext_getNativeIndex(text);
+    int32_t current;
+    int32_t rangeStart;
+    int32_t rangeEnd;
+    UChar32 c = utext_current32(text);
+    while((current = (int32_t)utext_getNativeIndex(text)) < endPos && delegate->handles(c)) {
+        utext_next32(text); // TODO: recast loop for postincrement
+        c = utext_current32(text);
+    }
+    rangeStart = start;
+    rangeEnd = current;
+    int32_t beforeSize = foundBreaks.size();
+    int32_t additionalCapacity = rangeEnd - rangeStart + 1;
+    // enlarge to contains (rangeEnd-rangeStart+1) more items
+    foundBreaks.ensureCapacity(beforeSize+additionalCapacity, status);
+    if (U_FAILURE(status)) return 0;
+    foundBreaks.setSize(beforeSize + beforeSize+additionalCapacity);
+    result = delegate->fillBreaks(text, rangeStart, rangeEnd, foundBreaks.getBuffer()+beforeSize,
+                                  additionalCapacity, status);
+    if (U_FAILURE(status)) return 0;
+    foundBreaks.setSize(beforeSize + result);
+    utext_setNativeIndex(text, current);
+    return result;
+}
+
 U_NAMESPACE_END
 
 #endif /* #if !UCONFIG_NO_BREAK_ITERATION */
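
Reviewer note on the brkeng.cpp change: getEngineFor() walks fEngines from the top of the stack down (`while (--i >= 0)`), and addExternalEngine() pushes onto that same stack. As I read it, an engine pushed later is therefore consulted before older ones that handle the same character. The following is a minimal standalone sketch of that lookup order only. It is not ICU code: `Engine`, `RangeEngine`, and `EngineRegistry` are hypothetical names, std::mutex and std::vector stand in for UMutex and UStack, and the lazy allocation done by ensureEngines() and the on-demand creation done by loadEngineFor() are left out.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for LanguageBreakEngine; only the handles() query is modeled.
class Engine {
public:
    virtual ~Engine() = default;
    virtual bool handles(char32_t c, const char* locale) const = 0;
    virtual std::string name() const = 0;
};

// Hypothetical engine that claims one inclusive code point range and ignores the locale.
class RangeEngine : public Engine {
public:
    RangeEngine(std::string name, char32_t lo, char32_t hi)
        : name_(std::move(name)), lo_(lo), hi_(hi) {}
    bool handles(char32_t c, const char* /*locale*/) const override {
        return lo_ <= c && c <= hi_;
    }
    std::string name() const override { return name_; }

private:
    std::string name_;
    char32_t lo_;
    char32_t hi_;
};

// Hypothetical registry: a mutex-guarded stack of owned engines.
class EngineRegistry {
public:
    // Takes ownership of the engine and pushes it on top of the stack.
    void add(std::unique_ptr<Engine> engine) {
        assert(engine != nullptr);
        std::lock_guard<std::mutex> lock(mutex_);
        engines_.push_back(std::move(engine));
    }

    // Scans from the most recently added engine to the oldest and returns
    // the first one that handles (c, locale), or nullptr if none does.
    const Engine* find(char32_t c, const char* locale) const {
        std::lock_guard<std::mutex> lock(mutex_);
        for (auto it = engines_.rbegin(); it != engines_.rend(); ++it) {
            if ((*it)->handles(c, locale)) {
                return it->get();
            }
        }
        return nullptr;
    }

private:
    mutable std::mutex mutex_;
    std::vector<std::unique_ptr<Engine>> engines_;
};
```

With two engines registered for the same range, `find()` returns the one added last, so in this sketch a later registration shadows an earlier one without removing it.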
