Regular object layout #80

Closed
markshannon opened this issue Aug 4, 2021 · 12 comments

Comments

@markshannon
Member

Accessing the __dict__ of an object takes a bit of pointer chasing and computation.
The code to get the address of the __dict__ is as follows:
(assuming we have already checked whether this object can have a dict)

    Py_ssize_t dictoffset = tp->tp_dictoffset;
    if (dictoffset < 0) {
        /* A negative offset is relative to the end of the object,
           so compute the size of the object first. */
        Py_ssize_t size = ...
        dictoffset += size;
    }
    dictptr = (PyObject **) ((char *)obj + dictoffset);

Given how often we access instance attributes, this overhead is significant.

What we would like is:

    dictptr = (PyObject **) ((char *)obj + CONSTANT);
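
As a rough illustration of the difference, the two address computations can be modelled with plain integer arithmetic. This is only a sketch: `WORD`, `FIXED_DICT_OFFSET` and both function names are hypothetical, and the object size is passed in as a parameter because the issue does not spell out how it is computed.

```python
WORD = 8  # assumed pointer size in bytes (64-bit platform)


def dict_slot_address(obj_addr, dictoffset, obj_size):
    """Model of the current lookup: a negative offset counts from the object's end."""
    if dictoffset < 0:
        # The size has to be known before the slot can be located.
        dictoffset += obj_size
    return obj_addr + dictoffset


# Hypothetical constant, chosen only for illustration.
FIXED_DICT_OFFSET = -3 * WORD


def dict_slot_address_fixed(obj_addr):
    """Model of the desired lookup: a single addition with a constant."""
    return obj_addr + FIXED_DICT_OFFSET
```

With a negative offset, two instances of the same class but of different sizes keep their dict pointer at different distances from the object pointer, which is why the size computation cannot be skipped in the current scheme.
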
@markshannon
Member Author

Since not all objects have a __dict__, we cannot simply put the dictionary at a fixed offset after the header without leaving holes in small objects like ints and floats.

Since any object that has a dictionary can be part of a cycle, it must have a GC header.
Therefore, we can put the dictionary directly before the GC header and there will be no gap.

Simple object without GC header, e.g. an int.

[Diagram: minimal]

Object that may be part of cycle, but without __dict__, e.g. a list.

[Diagram: cyclic]

Object with a __dict__

[Diagram: popo]

@methane

methane commented Aug 10, 2021

pymalloc aligns memory blocks to two words (8 bytes on 32-bit platforms, 16 bytes on 64-bit platforms).

Currently, the GC header is two words, so there is no gap. If we add __dict__ there, we need to add a gap.

@markshannon
Member Author

If we were to reduce the GC header to a single word, then placing the dict pointer before the header would be even more compelling as it would fill the gap.

@markshannon
Member Author

Given that allocations are two-word aligned, the pre-header needs to be an even number of words.

Long term we want this layout:

[Diagram: popo_compact]

But in the medium term, this gives us fixed offsets for dict and weakref pointers and is as compact as what we have now:

[Diagram: popo_with_weakrefs]
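
The alignment argument in the last few comments can be checked with a small calculation. This is a sketch only: the function name is hypothetical, and the word counts for each case are my reading of the comments, not taken from the diagrams.

```python
ALIGNMENT_WORDS = 2  # allocations are two-word aligned


def padded_preheader_words(gc_words, pointer_words):
    """Round the pre-header (GC header plus extra pointers) up to the alignment."""
    total = gc_words + pointer_words
    # Ceiling division, then scale back up to a multiple of the alignment.
    return -(-total // ALIGNMENT_WORDS) * ALIGNMENT_WORDS


def wasted_words(gc_words, pointer_words):
    """Words of padding needed to keep the pre-header an even number of words."""
    return padded_preheader_words(gc_words, pointer_words) - (gc_words + pointer_words)
```

Under these assumptions, a two-word GC header plus a lone dict pointer wastes one word, a two-word GC header plus dict and weakref pointers wastes none, and a one-word GC header plus a dict pointer also wastes none.
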

@pxeger
Copy link

pxeger commented Sep 22, 2021

Maybe a noobish and/or off-topic question because I don't know much about CPython memory layout, but why is there anything before what the object pointer points to at all? Why not just keep it all after that?

@markshannon
Member Author

It needs to be at a fixed offset and allow for variable sized objects and inheritance.

@gvanrossum
Collaborator

In particular, for ABI compatibility we need to keep ob_refcnt and ob_type at the same offset relative to the pointer.
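
One way to observe this from Python is to locate the type pointer inside a few unrelated objects and check that it sits at the same offset in each. This is a CPython-only sketch: it relies on `id()` returning the object's address, which is an implementation detail, and `type_pointer_offset` is a hypothetical helper.

```python
import ctypes

WORD = ctypes.sizeof(ctypes.c_void_p)


def type_pointer_offset(obj, max_words=4):
    """Return the byte offset of the first word equal to the address of type(obj)."""
    base = id(obj)  # CPython implementation detail: id() is the object's address
    for i in range(max_words):
        word = ctypes.c_void_p.from_address(base + i * WORD).value
        if word == id(type(obj)):
            return i * WORD
    return None


class Plain:
    pass


samples = [[], {}, 3.5, object(), Plain()]
offsets = {type_pointer_offset(sample) for sample in samples}
```

On a given build, `offsets` should contain a single value. Anything stored before the object pointer leaves that offset unchanged, which is what makes a pre-header compatible with existing compiled extensions.
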

@markshannon
Member Author

An alternative to the above, which will work well with #72 (comment) is:
[Diagram: popo_with_values]

@markshannon
Member Author

Regular object layout would also help the GC traverse and clean objects, as well as simplify code for inheritance of layouts, assigning __dict__ and __class__ attributes.
To that end we also need to consider the layout of an object after the class pointer.

I'm going to gloss over weak references here. If we inline the values, the weakref list can go where the values pointer is now.
In the future, we may maintain the weakrefs to an object externally, both saving memory and allowing weakrefs to any object.

The object can be broken down into five sections:

  1. dict and values pointers
  2. GC bits
  3. BaseObject (class and reference count)
  4. Custom section
  5. Slots

The first three are described above. The custom section is whatever is handled by the custom code for a builtin class. Slots are slots described either by __slots__ in Python code, or in the type spec.

E.g. the following class

    class XList(list):
        __slots__ = "a", "b", "__dict__"

would have all five sections.

In contrast, object has just the one section: BaseObject.
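
Some of this is visible from Python through the type attributes that mirror the C-level fields. The snippet below is a sketch for CPython; the exact numbers depend on the interpreter version and platform, so it compares values instead of stating them.

```python
import struct

WORD = struct.calcsize("P")  # size of a pointer on this platform


class XList(list):
    __slots__ = "a", "b", "__dict__"


x = XList([1, 2, 3])
x.a = "stored in a slot"
x.extra = "stored in __dict__"  # there is no slot named "extra"

# Attributes mirroring tp_basicsize and tp_dictoffset; values vary by version.
layout = {
    "list.__basicsize__": list.__basicsize__,
    "XList.__basicsize__": XList.__basicsize__,
    "list.__dictoffset__": list.__dictoffset__,
    "XList.__dictoffset__": XList.__dictoffset__,
    "object.__dictoffset__": object.__dictoffset__,
}
```

The two slots add at least two words on top of the list's own (custom) section, and only `XList` reports a non-zero dict offset.
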

Objects without a custom section are transparent to the GC and VM, and need no custom traverse, clear, or dealloc functions.
For efficiency we might want to insist that object slots are grouped together and precede non-object slots.

Inheritance

The rules for layout inheritance are much as they are now, but are slightly simplified by having the dictionary at a fixed offset.
Single inheritance in Python is always legal. New slots are added at the end.

Multiple inheritance where more than one class has a custom section is prohibited. I think this is the same as it is now: "multiple bases have instance lay-out conflict".
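
The existing behaviour is easy to reproduce in current CPython. In this sketch, `try_subclass` and `Mixin` are hypothetical names used only for the demonstration.

```python
def try_subclass(*bases):
    """Try to create a class from the given bases; return the error message, if any."""
    try:
        type("Combined", bases, {})
    except TypeError as exc:
        return str(exc)
    return None


class Mixin:
    """An ordinary Python class with no custom section."""


# list and dict each have their own custom section, so they cannot be combined.
conflict_message = try_subclass(list, dict)

# Combining a builtin with an ordinary Python class is fine.
mixin_message = try_subclass(list, Mixin)
```

Only one base may contribute a custom section; any number of plain Python classes can be mixed in alongside it.
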

GC operations

The GC needs to traverse objects and clear them. In addition, objects need to be deallocated.
These operations are all very similar to each other, and are also similar across different classes. Yet we have a plethora of different, often buggy, implementations.

For objects without a custom section (and perhaps for some special cases with a custom section) the above layout is transparent to the GC. The traversal and deallocation functions can be inlined leading to faster and more robust memory management.
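
To make the idea concrete, here is a toy Python model of a generic traverse and clear for objects that have only a dict and object slots. It is not CPython code: `RegularObject`, `traverse` and `clear` are hypothetical names, and the real functions would operate on raw memory in C.

```python
from dataclasses import dataclass, field


@dataclass
class RegularObject:
    """Toy model of an object with no custom section."""
    attrs: dict = field(default_factory=dict)  # stands in for the __dict__ pointer
    slots: list = field(default_factory=list)  # stands in for the object slots


def traverse(obj, visit):
    """Visit every reference held by obj, using only the regular layout."""
    if obj.attrs is not None:
        visit(obj.attrs)
    for value in obj.slots:
        if value is not None:
            visit(value)


def clear(obj):
    """Drop every reference held by obj, breaking any cycles through it."""
    obj.attrs = None
    obj.slots = [None] * len(obj.slots)
```

Because both functions depend only on the layout, one implementation could serve every class without a custom section, instead of each class supplying its own.
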

@gvanrossum
Collaborator

So the slots section is not used for "ordinary attributes" (the ones that go into __dict__), right? Only for things described in the type object (how?) or in __slots__.

For a tuple, would the custom section contain everything, or would the custom section only contain the length so the items would become slots? (That would be handy for namedtuples too.) But this seems to contradict the idea that the layout would be transparent to the GC -- it would have to know to look in the custom section to find how many slots there are.

I worry about backwards compatibility here -- 3rd party type definitions should remain supported (probably for many releases). I also worry specifically about tp_dictoffset, which appears in the public headers, even though one should call _PyObject_GetDictPtr() instead.

@markshannon
Member Author

"Custom" can contain anything. Fully backwards compatible and opaque to the GC.
Tuples would be "custom". Having custom layouts for things like tuples, lists and dicts is fine. They're special.

Code that sets tp_dictoffset is fine; that just means we have a custom layout.

Code that uses tp_dictoffset is problematic only in the case where we have the __dict__ pointer in the header, and the class has instances of differing size, e.g. a tuple.

Which means that we can't use regular object layout for classes that inherit from tuple, bytes, etc.
I don't think that will be a problem in practice.

For example, this class:

    class XTuple(tuple):
        pass

would not be able to have the __dict__ pointer in the header; it would have to go in the "custom" section.
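
The variable size of such instances can be observed from Python. This is a CPython sketch; the concrete sizes and the value of `__dictoffset__` depend on the interpreter version, so only relationships are checked.

```python
import sys


class XTuple(tuple):
    pass


empty = XTuple(())
three = XTuple((1, 2, 3))
three.note = "instances still get a __dict__"

# tuple stores its items inline, so each extra item grows the instance.
growth = sys.getsizeof(three) - sys.getsizeof(empty)

info = {
    "tuple.__itemsize__": tuple.__itemsize__,
    "list.__itemsize__": list.__itemsize__,
    "XTuple.__dictoffset__": XTuple.__dictoffset__,
    "growth for 3 items": growth,
}
```

A non-zero `__itemsize__` is what marks tuple as variable-sized. A list keeps its items in a separate buffer, so its instances have a fixed size and its `__itemsize__` is zero.
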

@markshannon
Member Author

I think this is done. We might want to tweak this later, but that can be its own issue.
FTR, the final layout chosen was #80 (comment)
