[3.12] gh-113993: Make interned strings mortal (GH-120520, GH-121364, GH-121903, GH-122303) #123065

Merged (14 commits) on Sep 27, 2024

@encukou (Member) commented Aug 16, 2024

This backports several PRs for gh-113993, making interned strings mortal so they can be garbage-collected when no longer needed.
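As a rough illustration of what interning means at the Python level (this is not part of the PR's diff), the sketch below uses `sys.intern`: two equal strings built at run time start out as distinct objects and map to one shared object once interned. The helper name `build_name` is made up for the example, and the "distinct before interning" observation is CPython behaviour, not a language guarantee.

```python
import sys

def build_name(parts):
    """Join parts at run time so the result is a fresh, non-constant string."""
    return "_".join(parts)

a = build_name(["mortal", "intern", "demo"])
b = build_name(["mortal", "intern", "demo"])

# Equal contents, but two separate objects (CPython behaviour for run-time joins).
distinct_before = a is not b

# Interning maps both strings to one canonical object.
ia = sys.intern(a)
ib = sys.intern(b)
same_after = ia is ib
```

With mortal interning, that canonical object can be freed once the last reference to it goes away, instead of living until interpreter shutdown.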

  • Allow interned strings to be mortal, and fix related issues (GH-120520)

    • Add an InternalDocs file describing how interning should work and how to use it.

    • Add internal functions to explicitly request what kind of interning is done:

      • _PyUnicode_InternMortal
      • _PyUnicode_InternImmortal
      • _PyUnicode_InternStatic
    • Switch uses of PyUnicode_InternInPlace to those.

    • Disallow using _Py_SetImmortal on strings directly.
      You should use _PyUnicode_InternImmortal instead:

      • Strings should be interned before immortalization; otherwise you may end up
        immortalizing a copy rather than the interned string.
      • _Py_SetImmortal doesn't handle the SSTATE_INTERNED_MORTAL to
        SSTATE_INTERNED_IMMORTAL update, and those flags can't be changed in
        backports, as they are now part of public API and version-specific ABI.
    • Add private _only_immortal argument for sys.getunicodeinternedsize, used in refleak test machinery.

    • Make sure the statically allocated string singletons are unique. This means these sets are now disjoint:

      • _Py_ID
      • _Py_STR (including the empty string)
      • one-character latin-1 singletons

      Now, when you intern a singleton, that exact singleton will be interned.

    • Add a _Py_LATIN1_CHR macro, use it instead of _Py_ID/_Py_STR for one-character latin-1 singletons everywhere (including Clinic).

    • Intern _Py_STR singletons at startup.

    • Beef up the tests. Cover internal details (marked with @cpython_only).

    • Add lots of assertions

  • Don't immortalize in PyUnicode_InternInPlace; keep immortalizing in other API (GH-121364)

    • Switch PyUnicode_InternInPlace to _PyUnicode_InternMortal, clarify docs

    • Document immortality in some functions that take const char *

    This covers PyUnicode_InternFromString, PyDict_SetItemString,
    PyObject_SetAttrString, PyObject_DelAttrString,
    and the PyModule_Add convenience functions.

    Always point out a non-immortalizing alternative.

    • Don't immortalize user-provided attr names in _ctypes
  • Immortalize names in code objects to avoid crash (gh-121863, GH-121903)

  • Intern latin-1 one-byte strings at startup (gh-122291, GH-122303)
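
The one-character latin-1 singletons mentioned in the list above can be observed from Python. The sketch below relies on a CPython implementation detail (that `chr()` returns a shared object for code points below 256); the function name is made up for the example, and the check says nothing about whether those objects are in the interned table.

```python
def latin1_chars_are_shared():
    """Return True if chr() yields one shared object per latin-1 code point.

    This holds on CPython as an implementation detail; other implementations
    may behave differently.
    """
    return all(chr(i) is chr(i) for i in range(256))

shared = latin1_chars_are_shared()
```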

There are some 3.12-specific changes, mainly to allow statically allocated strings in deepfreeze. (In 3.13, deepfreeze switched to the general _Py_ID/_Py_STR.)
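
The rule from the description that strings should be interned before immortalization can be sketched with a toy model. Everything here (`ToyStr`, `toy_intern`, `toy_intern_immortal`, the `immortal` flag) is hypothetical and only mimics the ordering problem; it is not how CPython implements interning.

```python
class ToyStr:
    """Stand-in for a string object carrying an 'immortal' flag (hypothetical)."""
    def __init__(self, value):
        self.value = value
        self.immortal = False

_table = {}  # toy interned-strings table: value -> canonical object

def toy_intern(s):
    """Return the canonical object for s.value, registering s if it is the first."""
    return _table.setdefault(s.value, s)

def toy_intern_immortal(s):
    """Intern first, then immortalize whichever object the table returned."""
    canonical = toy_intern(s)
    canonical.immortal = True
    return canonical

# Wrong order: immortalize, then intern. An equal string is already in the
# table, so the immortal flag ends up on a copy, not on the canonical object.
first = toy_intern(ToyStr("attr"))
copy = ToyStr("attr")
copy.immortal = True
wrong = toy_intern(copy)
wrong_order_marks_canonical = wrong.immortal

# Right order: intern, then immortalize the canonical object.
right = toy_intern_immortal(ToyStr("attr"))
right_order_marks_canonical = right.immortal
```

In the wrong-order case the copy stays immortal but is never the object other code gets back from the table, which is the situation the description warns about.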

Co-authored-by: Eric Snow [email protected]


📚 Documentation preview 📚: https://cpython-previews--123065.org.readthedocs.build/

Issue: #113993

encukou and others added 7 commits August 16, 2024 13:36
…related issues (pythonGH-120520)

… keep immortalizing in other API (pythonGH-121364)

(cherry picked from commit b4aedb2)
@encukou encukou added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Aug 21, 2024
@bedevere-bot commented

🤖 New build scheduled with the buildbot fleet by @encukou for commit 2640dc8 🤖

If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Aug 21, 2024
@encukou (Member, Author) commented Aug 22, 2024

The buildbot failures look unrelated.

@encukou (Member, Author) commented Sep 23, 2024

@Yhg1s, please review this backport. Sorry about the size!

@Yhg1s Yhg1s merged commit 49f6beb into python:3.12 Sep 27, 2024
29 checks passed
@encukou encukou deleted the mortal-interns-3.12 branch September 27, 2024 23:31
@mgorny (Contributor) commented Oct 2, 2024

FYI, this is missing at least a backport of 281ffb6, which causes Rust packages to crash on assertions. I've filed #124887.
