-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FEAT] [RFC] Class builder API for high performance bindings #2810
Comments
If we can really get this kind of performance improvements, I am very much interested. How confident are you that this considerable effort is going in the right direction? I must say that I'm not a fan of "duplicate" API. Reworking the internals of I was also thinking about moving away from heap types and creating static ones. That would require type objects with static lifetime and I don't think pybind can do that without leaking the type objects. |
@bstaletic Actually I'm not sure... I'm relatively confident that not using |
That's a given. Conversions can be tamed with extensive usage of
This the one I was interested the most in. I'm sure I don't need to tell you that actual numbers that show where pybind11 is spending the time is going to help. Either in convincing others that this is a good idea or saving time by not going all in for tiny gains. |
Actually considering slots for magic methods complicates things so I'm thinking about just ignoring it for the time being. Let me create a prototype that handles method and attribute definition, but does not support overloading. I'll then compare it to direct bindings to see how it works. |
Update: I created a very rough prototype and compared the performance of Pybind11 bindings, proposed API bindings and direct bindings (Python C APIs only) on four situations:
And the benchmark result is as follows:
Note that the prototype for the proposed Another problem with the proposed API is that it uses the closure feature from |
You can say that again. I was really expecting some more encouraging numbers. The "direct" row in the table shows that there is quite an overhead to what pybind is doing... For the record, I'm not making any calls right now. I'd like to hear from others first. |
Thanks for the overview of your proposal, @lqf96 (and apologies for my delayed feedback) ! The proof-of-concept route indeed seems like the right route to take and to clarify some things. Given the most recent results, some profiling might also be able to help. Apart from that, it will also really be up to @wjakob to judge whether such an interface is in the scope of pybind11. The kind of abstractions that pybind11 provides will of course always have some overhead, but I agree that 5-20x is a lot. |
Indirection here means that, as I said above, every pybind11 method is implemented as:
Additionally, each property also has this whole thing wrapped again in a The benchmark result is surprising to me because I initially thought these multiple levels of wrapping are the culprit of pybind11 call overhead, however replacing them with direct |
@YannickJadoul Regardless of the benchmark result I got, I still want to move |
I currently don't really see the advantage of that outweighing all the potential merge conflict and confusion, to be fair. |
@YannickJadoul From the experimental results, it seems that improving the efficiency of
I'm fine with not moving the code for |
Doesn't that disable |
Why? Keyword arguments can be supported with |
Oh, I wasn't aware you can mix |
No, if that's part of a rewrite, then refactoring can make perfect sense (maybe not even to detail, but just to For the record, I've sent @wjakob a message, since he, as original author and project leader, is still the one to call shots on "in scope"/"out of scope" for pybind11.
This would disable (or massively complicate) overload resolution, though? But curious to see if that's not the case. |
Motivation
Pybind11 provides easy-to-use APIs for writing Python bindings. However, these bindings are often implemented with multiple level of indirections, which can have negative impact on the performance. For example,
class_::def_property
andclass_::def_property_readonly
are currently implemented as:PyCFunction
wrapper capturing a capsule holding adetail::function_record
instancemethod
property
stored in the class dictionary.Compared to a
getset_descriptor
directly implemented withPyGetSetDef
, this can be 5x-20x slower both in terms of execution time and instruction counts. Similarly, constructors implemented in pybind11 can be slower than a low-level constructor function directly bound to type with thetp_init
slot. Both problems show the necessity of having a set of high performance binding APIs for pybind11.Design
The new API is called "class builder" because it employs the builder pattern:
To create high-performance bindings for a class, we would call
pybind11::class_builder
instead ofpybind11::class_
with almost exactly the same parameters. This gives us a class builder, which we can use to define a series of methods and attributes (note these aregetset_descriptor
s notproperty
s). After we define all the members of the class, we call thebuild
method, which will create the Python class and return apybind11::class_<...>
instance.Internally,
def_attr
anddef_attr_readonly
is implemented as adding a new entry to thetp_getset
slot, anddef
,def_class
anddef_static
is implemented as adding a new entry to thetp_methods
slot. Thebuild
method will callPyType_Ready
method on the type being built, and wraps the readyPyTypeObject
in apybind11::class_<...>
instance.Misc
cpp_function
into two parts: object wrapper and dispatching logic. For the class builder API, the previous part isn't needed. We'll however keep it for backward compatibility.class_builder
API, or should we just rework the internals ofpybind11::class_
? Personally I think it depends on whether we are allowed to add newPyMethodDef
andPyGetSetDef
after callingPyType_Ready
. I guess the answer is probably no so a separate set of APIs is likely needed.detail::function_record
, and it is challenging to make them work withPyMethodDef
, becausePyMethodDef
has no field to store the closure pointer (PyGetSetDef
has a closure pointer and does not have this problem). Perhaps we will need the closure functionality fromlibffi
to solve this problem?The text was updated successfully, but these errors were encountered: