[3.12] gh-113993: Make interned strings mortal (GH-120520, GH-121364, GH-121903, GH-122303) #123065
Merged
…related issues (pythonGH-120520)

* Add an InternalDocs file describing how interning should work and how to use it.
* Add internal functions to *explicitly* request what kind of interning is done:
  - `_PyUnicode_InternMortal`
  - `_PyUnicode_InternImmortal`
  - `_PyUnicode_InternStatic`
* Switch uses of `PyUnicode_InternInPlace` to those.
* Disallow using `_Py_SetImmortal` on strings directly. You should use `_PyUnicode_InternImmortal` instead:
  - Strings should be interned before immortalization, otherwise you're possibly immortalizing a copy.
  - `_Py_SetImmortal` doesn't handle the `SSTATE_INTERNED_MORTAL` to `SSTATE_INTERNED_IMMORTAL` update, and those flags can't be changed in backports, as they are now part of public API and version-specific ABI.
* Add private `_only_immortal` argument for `sys.getunicodeinternedsize`, used in refleak test machinery.
* Make sure the statically allocated string singletons are unique. This means these sets are now disjoint:
  - `_Py_ID`
  - `_Py_STR` (including the empty string)
  - one-character latin-1 singletons

  Now, when you intern a singleton, that exact singleton will be interned.
* Add a `_Py_LATIN1_CHR` macro, use it instead of `_Py_ID`/`_Py_STR` for one-character latin-1 singletons everywhere (including Clinic).
* Intern `_Py_STR` singletons at startup.
* Beef up the tests. Cover internal details (marked with `@cpython_only`).
* Add lots of assertions

Co-authored-by: Eric Snow <[email protected]>
… keep immortalizing in other API (pythonGH-121364)

* Switch PyUnicode_InternInPlace to _PyUnicode_InternMortal, clarify docs
* Document immortality in some functions that take `const char *`

  This is PyUnicode_InternFromString; PyDict_SetItemString, PyObject_SetAttrString; PyObject_DelAttrString; and the PyModule_Add convenience functions. Always point out a non-immortalizing alternative.
* Don't immortalize user-provided attr names in _ctypes

(cherry picked from commit b4aedb2)
The buildbot failures look unrelated.
@Yhg1s, please review this backport. Sorry about the size!
…n#124464) _PyID does not exist but _Py_ID does.
This backports several PRs for gh-113993, making interned strings mortal so they can be garbage-collected when no longer needed.
* Allow interned strings to be mortal, and fix related issues (GH-120520)
  * Add an InternalDocs file describing how interning should work and how to use it.
  * Add internal functions to *explicitly* request what kind of interning is done:
    - `_PyUnicode_InternMortal`
    - `_PyUnicode_InternImmortal`
    - `_PyUnicode_InternStatic`
  * Switch uses of `PyUnicode_InternInPlace` to those.
  * Disallow using `_Py_SetImmortal` on strings directly. You should use `_PyUnicode_InternImmortal` instead:
    - Strings should be interned before immortalization, otherwise you're possibly immortalizing a copy.
    - `_Py_SetImmortal` doesn't handle the `SSTATE_INTERNED_MORTAL` to `SSTATE_INTERNED_IMMORTAL` update, and those flags can't be changed in backports, as they are now part of public API and version-specific ABI.
  * Add private `_only_immortal` argument for `sys.getunicodeinternedsize`, used in refleak test machinery.
  * Make sure the statically allocated string singletons are unique. This means these sets are now disjoint:
    - `_Py_ID`
    - `_Py_STR` (including the empty string)
    - one-character latin-1 singletons

    Now, when you intern a singleton, that exact singleton will be interned.
  * Add a `_Py_LATIN1_CHR` macro, use it instead of `_Py_ID`/`_Py_STR` for one-character latin-1 singletons everywhere (including Clinic).
  * Intern `_Py_STR` singletons at startup.
  * Beef up the tests. Cover internal details (marked with `@cpython_only`).
  * Add lots of assertions
* Don't immortalize in `PyUnicode_InternInPlace`; keep immortalizing in other API (GH-121364)
  * Switch `PyUnicode_InternInPlace` to `_PyUnicode_InternMortal`, clarify docs
  * Document immortality in some functions that take `const char *`

    This is `PyUnicode_InternFromString`; `PyDict_SetItemString`, `PyObject_SetAttrString`; `PyObject_DelAttrString`; and the `PyModule_Add` convenience functions. Always point out a non-immortalizing alternative.
* Immortalize names in code objects to avoid crash (gh-121863; GH-121903)
* Intern latin-1 one-byte strings at startup (gh-122291; GH-122303)
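
To make the mortal/immortal distinction in the list above concrete, here is a toy model in pure Python. It is only a sketch under loud assumptions: `Text`, `ToyInternTable`, `intern_mortal` and `intern_immortal` are hypothetical names invented for illustration, and the weak-dictionary approach is a stand-in, not how CPython's interned-string table is implemented. It does mirror two points from the description: a mortal entry disappears with its last outside reference, and immortalizing goes through interning first so that a duplicate copy is never the object that gets pinned.

```python
import gc
import weakref


class Text(str):
    """A str subclass, used because plain str objects cannot be weakly referenced."""


class ToyInternTable:
    """Toy model only: mortal entries are held weakly, immortal ones strongly."""

    def __init__(self):
        self._mortal = weakref.WeakValueDictionary()
        self._immortal = {}

    def intern_mortal(self, s):
        key = str(s)  # plain-str key; it holds no reference to the Text object
        if key in self._immortal:
            return self._immortal[key]
        existing = self._mortal.get(key)
        if existing is not None:
            return existing
        self._mortal[key] = s
        return s

    def intern_immortal(self, s):
        # Intern first, then pin the canonical object, so that a duplicate
        # copy is never the one that gets immortalized.
        canonical = self.intern_mortal(s)
        self._immortal[str(canonical)] = canonical
        return canonical

    def __contains__(self, key):
        return key in self._immortal or key in self._mortal


table = ToyInternTable()
pinned = table.intern_immortal(Text("spam"))
temp = table.intern_mortal(Text("eggs"))
duplicate_is_canonical = table.intern_mortal(Text("eggs")) is temp

del temp, pinned
gc.collect()  # belt and braces; CPython's refcounting already freed "eggs"

eggs_survived = "eggs" in table   # mortal entry goes away with its last reference
spam_survived = "spam" in table   # immortal entry stays for the table's lifetime
```

In this model, dropping every outside reference to `"eggs"` removes it from the table, while `"spam"` stays because the table itself keeps it alive.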
There are some 3.12-specific changes, mainly to allow statically allocated strings in deepfreeze. (In 3.13, deepfreeze switched to the general `_Py_ID`/`_Py_STR`.)

Co-authored-by: Eric Snow [email protected]
📚 Documentation preview 📚: https://cpython-previews--123065.org.readthedocs.build/
Issue: #113993
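
For readers who know interning only from the Python side, the following minimal sketch shows the guarantee that interning provides, using the public `sys.intern`. The `build` helper and the sample strings are made up for illustration; the sketch does not exercise the internal C functions named in this PR, and whether `a` and `b` are distinct objects before interning is an implementation detail that it does not rely on.

```python
import sys


def build(parts):
    # Join at runtime so the compiler cannot fold the result into a constant.
    return "".join(parts)


a = build(["gh", "-", "113993", " demo"])
b = build(["gh", "-", "113993", " demo"])

# sys.intern returns the canonical object for a given string value.
canonical_a = sys.intern(a)
canonical_b = sys.intern(b)

print(a == b)                      # the two strings have equal contents
print(canonical_a is canonical_b)  # after interning they share one object
```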