Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-92678: Fix MRO calculation for C extension types with manual __dict__ #95242

Closed
wants to merge 1 commit into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
Fix MRO calculation issue which caused C extension types manually managing
their own ``__dict__`` to fail in Python 3.11.
6 changes: 6 additions & 0 deletions Objects/typeobject.c
Original file line number Diff line number Diff line change
Expand Up @@ -2241,6 +2241,12 @@ extra_ivars(PyTypeObject *type, PyTypeObject *base)
type->tp_weaklistoffset + sizeof(PyObject *) == t_size &&
type->tp_flags & Py_TPFLAGS_HEAPTYPE)
t_size -= sizeof(PyObject *);
if (!(type->tp_flags & Py_TPFLAGS_MANAGED_DICT) &&
Copy link
Member

@markshannon markshannon Jul 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is is correct to check Py_TPFLAGS_MANAGED_DICT during class creation?
It is set during class creation: https://github.com/python/cpython/blob/main/Objects/typeobject.c#L2989

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think setting the flag happens before the MRO check. https://github.com/python/cpython/blob/main/Objects/typeobject.c#L3174-L3186
(type_new_descriptors then PyType_Ready ).
(PyType_Ready -> type_ready -> type_ready_mro)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you factor out the check into a macro, the code would be a bit clearer.

#define FIELD_AT_OFFSET(name, offset) type->tp_ ## name && base->tp_## name == 0 ...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to check for tp_dictoffset and tp_weaklistoffset, and shouldn't presume the order.
So we need to check for weakrefs, then dict, then weakrefs again (or vice versa).

if (FIELD_AT_OFFSET(dictoffset, t_size)) {
   t_size -= sizeof(PyObject *);
}
if (FIELD_AT_OFFSET(weaklistoffset, t_size)) {
   t_size -= sizeof(PyObject *);
}
if (FIELD_AT_OFFSET(dictoffset, t_size)) {
   t_size -= sizeof(PyObject *);
}

type->tp_dictoffset && base->tp_dictoffset == 0 &&
type->tp_dictoffset + sizeof(PyObject *) == t_size &&
type->tp_flags & Py_TPFLAGS_HEAPTYPE) {
t_size -= sizeof(PyObject *);
}
Comment on lines +2244 to +2249
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This reintroduces the old code pre-3.11, but guards it with a check for not Py_TPFLAGS_MANAGED_DICT.

8319114#diff-1decebeef15f4e0b0ce106c665751ec55068d4d1d1825847925ad4f528b5b872L2254

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The condition is a bit arcane. Could you capture it in a variable or use a macro or inline function so we can name what we are checking for? Currently is a bit difficult to decipher what happens when all of these are true or false

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So funny thing is I also found the old code arcane and couldn't really understand it well. From my rough guesstimation

    // The __dict__ isn't automatically managed by CPython.
    if (!(type->tp_flags & Py_TPFLAGS_MANAGED_DICT) &&
        // That a dictoffset is set. Meaning it's a manual __dict__.
        // And the base class doesn't have a __dict__
        type->tp_dictoffset && base->tp_dictoffset == 0 &&
        // Huh?
        type->tp_dictoffset + sizeof(PyObject *) == t_size &&
        // Not a static type.
        type->tp_flags & Py_TPFLAGS_HEAPTYPE) {
        t_size -= sizeof(PyObject *);
    }

Copy link
Member Author

@Fidget-Spinner Fidget-Spinner Jul 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a really cursed example of Hyrum's law.

The extra_ivars code assumes that __weakreflist__ comes first then __dict__. In mypyc it has to follow this order to work pre-3.11. Now in 3.11 when we remove that assumption, all the C extensions using this with complicated MROs break 🤦 .

return t_size != b_size;
}

Expand Down