Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-39102: Increase Enum performance up to 10x times (3x average) #17669

Closed
wants to merge 24 commits into from

Conversation

Bobronium
Copy link

@Bobronium Bobronium commented Dec 20, 2019

Increase Enum performance

https://bugs.python.org/issue39102

Partial benchmark result, to see more, check full benchmark result and source code

ATTR_ACCESS:
Testing with 10000 repeats, result is average of 10 tests:

    >>> NewEnum.foo  # 0.6069349023270748 ms
    <NewEnum.foo: 1>

    >>> OldEnum.foo  # 4.5089948174334005 ms
    <OldEnum.foo: 1>

    >>> NewEnum.foo.value  # 0.965516420197984 ms
    1

    >>> OldEnum.foo.value  # 5.234090615261106 ms
    1

    >>> NewEnum.value.value  # 0.6496817059283957 ms
    3

    >>> OldEnum.value.value  # 19.718928336346387 ms
    3

    >>> try:     NewEnum.value.value = 'new' except: pass  # 4.842046525117963 ms
    AttributeError("Can't set attribute")

    >>> try:     OldEnum.value.value = 'new' except: pass  # 24.484108522222897 ms
    AttributeError("can't set attribute")


NewEnum: total: 7.0641796 ms, average: 1.1652198 ms (Fastest)
OldEnum: total: 53.9461223 ms, average: 10.3317085 ms, ~ x8.87 times slower than NewEnum 

Full benchmark result and source code

Changes

  • Optimized Enum.__new__
  • Remove EnumMeta.__getattr__,
  • Store Enum.name and .value in members __dict__ for faster access
  • Deprecate getting values from Enum._name_ and ._value_, should use public .name and .value now
  • Replace Enum._member_names_ with ._unique_member_map_ for faster lookups and iteration
  • Replace _EmumMeta._member_names and ._last_values with .members mapping (deprecation warning on using old attrs)
  • Add _str_, _repr_ and _invert_ attrs to cache results of functions of the same dunder names (x35 speed boost) (948d3de)
  • Various minor improvements
  • Add support for direct setting and getting class attrs on DynamicClassAttribute without need to use slow __getattr__

https://bugs.python.org/issue39102

@the-knights-who-say-ni
Copy link

Hello, and thanks for your contribution!

I'm a bot set up to make sure that the project can legally accept this contribution by verifying everyone involved has signed the PSF contributor agreement (CLA).

CLA Missing

Our records indicate the following people have not signed the CLA:

@MrMrRobat

For legal reasons we need all the people listed to sign the CLA before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue.

If you have recently signed the CLA, please wait at least one business day
before our records are updated.

You can check yourself to see if the CLA has been received.

Thanks again for the contribution, we look forward to reviewing it!

super().__setitem__(key, value)

@property
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the deprecation part could be addressed in a separate commit.

Copy link
Author

@Bobronium Bobronium Dec 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite understand how it should look like, though. Should I change anything now?

Just realized that _value_ and _name_ attrs remained independent from
name and value and could be different. This commit fixes it.
Bobronium added a commit to Bobronium/fastenum that referenced this pull request Dec 22, 2019
Run builtin python tests directly from python test module
Refactor patch applying
Copy link
Member

@ethanfurman ethanfurman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry a proper review is taking so long -- there is a lot to go over here.
What I can say so far:

  • name and value must show up in a member's dir();
  • _leading_underscore_names and plain_names must not be used in EnumMeta nor Enum as then the user cannot use those same names -- that's why _sunder_ and __dunder__ names are used instead;

I do have some private code that I haven't merged in yet that does away with DynamicClassAttribute -- I'll try to get that done in the next few weeks. I think a lot of the work you have done here will still be applicable.

Thank you for your efforts!

Copy link
Author

@Bobronium Bobronium left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feedback needed

value = self._value_
if self.name is not None:
return f're.{self.name}'
value = self.value
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should use _repr_ cache here

for k, v in c.__dict__.items()
if isinstance(v, DynamicClassAttribute)}

# Reverse value->name map for hashable values.
enum_class._value2member_map_ = {}

# used to speedup __str__, __repr__ and __invert__ calls when applicable
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Despite the fact that this works pretty well and gives a great performance boost, I'm not sure that it was implemented right, so feedback is highly appreciable.

Comment on lines +603 to +612
# TODO: Maybe remove try/except block and setting __context__ in this case?
try:
exc = None
result = cls._missing_(value)
except Exception as e:
exc = e
result = None
# Huge boost for standard enum
if cls._missing_ is Enum._missing_:
raise
else:
e.__context__ = ValueError(f'{value!r} is not a valid {cls.__qualname__}')
raise
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this try/except block code runs twice slower for non-enum values checks, then without it .
Do we really need to bother checking all errors and add ValueError to context , knowing it will happen either in standard Enum._missing_ or in user-defined code? In fact, Flag._missing_ also raises same error as Enum._missing_ and handled and rerised too.

@Bobronium
Copy link
Author

@ethanfurman, any chance this PR will be reviewed in the near future?

@Bobronium Bobronium requested a review from ethanfurman May 26, 2020 15:06
@TylerYep
Copy link

Not a core team member, but I would really love this performance boost - any chance this will be reviewed soon?

@ethanfurman
Copy link
Member

I have some other changes planned, which include some speed ups. We'll see the changes in 3.10 (3.9 is in bug-fix only prior to first release).

@Bobronium
Copy link
Author

I would really love this performance boost

@TylerYep, you can check out enum patch I've created for current python versions, it’s a drop-in solution that enables changes from this PR along with some other improvements: https://github.com/MrMrRobat/fastenum,
I’d appreciated any feedback on this :)

looks like I haven’t uploaded it on PyPi though. Maybe I’ll upload it on this weekend, but for now it’s still should be installable via pip:

pip install git+https://github.com/MrMrRobat/fastenum.git

@belm0
Copy link
Contributor

belm0 commented Nov 23, 2020

This PR takes the approach of caching things, perhaps that is best.

However, when I looked at the Flag code several months ago (being unaware of this PR), I kept wondering why carry such a heavy and complicated _decompose() function in the first place? A flag is just bits, right? So I sought to eliminate it.

Proof of concept is here: belm0#1

However, this refactor is thwarted by some odd edge behavior of Flag and IntFlag, which in my opinion no one should be relying on. To proceed with it, there would have to be will among maintainers to fix/deprecate these things:

  1. repr of a compound value contains redundant compound values (e.g. given {A=1, B=2, C=4, AB=A|B}, repr(A|B|C) is C|AB|B|A)
  2. allowing incoherent Flag definitions which don't decompose into individual bits (e.g. {B=3, C=4})
  3. IntFlag invert behavior, which has a one-to-many mapping into the negative value space (e.g. given {A=1, B=2}, ~B is <Bar.A: -3> rather than <Bar.A: 1> (and furthermore ~B == A is False)

@ethanfurman
Copy link
Member

@belm0 Thanks for the concise summary.

1 is a bug, and needs to be changed.
2 needs to stay.
3 needs both -- a Flag should behave as you say, but IntFlag, because it is based on integers, needs to behave like an integer would.

@belm0
Copy link
Contributor

belm0 commented Nov 23, 2020

3 needs both -- a Flag should behave as you say, but IntFlag, because it is based on integers, needs to behave like an integer would.

It was a design choice, wasn't it? IntFlag derives from int and Flag. When implementing invert, it was decided to make it act like int. It doesn't need to be that way, and I suspect most use cases would rather have the Flag behavior.

From my understanding, the primary motivation ofIntFlag was to have a Flag that could be passed into legacy or 3rd party API. Int-ness should have been secondary to Flag-ness. If there is some odd API that takes such a value and does operations other than &, |, ^, or expects~to act as a signed invert, go ahead and cast to int() before passing it in.

@ethanfurman
Copy link
Member

@belm0 If a concrete data type is mixed into an Enum (such as str, int, CustomDataType, etc.) that data type is the most important thing about it. It would be more accurate to say "if an Enum is mixed into a concrete data type".

The primary motivation behind IntFlag was to be able to add Enum representation to "magic ints" that are already being used as flags -- so IntFlag behavior has to exactly match (or as nearly as possible) the behavior that an int would show.

What would be most helpful at this point is some actual code samples of "magic" integer flags that have inversion applied to them, and looking at how the surrounding code is treating the result of that operation.

Hmmm.... - just thinking out loud here, but it seems to me that the problem is that integer flags are (usually? always?) unsigned, but Python doesn't have that particular concept. Perhaps the thing to do is to make Flag reveal it's int roots (so they could be used as indices, in calculations, etc.) but have it represent the unsigned version, and IntFlag could then remain the signed version.

Thoughts?

@belm0
Copy link
Contributor

belm0 commented Nov 24, 2020

@belm0 If a concrete data type is mixed into an Enum (such as str, int, CustomDataType, etc.) that data type is the most important thing about it. It would be more accurate to say "if an Enum is mixed into a concrete data type".

The primary motivation behind IntFlag was to be able to add Enum representation to "magic ints" that are already being used as flags -- so IntFlag behavior has to exactly match (or as nearly as possible) the behavior that an int would show.

I don't agree with these rationales-- it probably won't be productive for us to argue it now.

What would be most helpful at this point is some actual code samples of "magic" integer flags that have inversion applied to them, and looking at how the surrounding code is treating the result of that operation.

Hmmm.... - just thinking out loud here, but it seems to me that the problem is that integer flags are (usually? always?) unsigned, but Python doesn't have that particular concept. Perhaps the thing to do is to make Flag reveal it's int roots (so they could be used as indices, in calculations, etc.) but have it represent the unsigned version, and IntFlag could then remain the signed version.

The only reasonable way to do flag negation with signed integers is to negate and then mask with the valid bits. This is effectively what Flag does, and it's what IntFlag should do as well. Either no one is using IntFlag invert, or if they are they're having to do such masking manually.

No one wants or is expecting a flag/enum component to have multiple representations (<Bar.A: 1>, <Bar.A: -3>, <Bar.A: -5>, ...).

@ethanfurman
Copy link
Member

@belm0 : As I alluded to earlier, IntEnum needs to be a drop-in replacement for currently in-use magic ints -- and all the locations I looked at in the stdlib do indeed mask off the negative bits after an inversion.

As you say, it is the only reasonable way to use negated flags.

I'm more comfortable making that change now, thank you for your feedback.

@belm0
Copy link
Contributor

belm0 commented Dec 1, 2020

allowing incoherent Flag definitions which don't decompose into individual bits (e.g. {B=3, C=4})

2 needs to stay.

What is the use case for this one?

Given something like {B=3, C=4}, what will be the defined behavior of the new iteration added to 3.10?
What about {A=1, B=3, C=4}? (the iteration is expected to produce individual bit members)

Copy link
Member

@ethanfurman ethanfurman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For 3.10 and supporting __init_subclass__ I've made several structural changes to the enum implementation. In doing that I included many of the optimizations that you spoke of. If you are not already in the ACK file I'll add you there in my next commit.

Improvements made since your PR:

  • enum.property, a subclass of DynamicClassAttribute, is used : it avoids the slow __getattr__ lookup
  • decompose has been removed (with belm0's help)
  • _invert_ is cached

Things that likely will not change:

  • _name_ as an enum.property
  • _value_ as an enum.property
  • __member_map__ as the public API for accessing enum internals

Things I am open to:

  • using set() instead of lists
  • using internal dict()s instead of matching list()s to track names/values
  • using f-strings
  • caching __member_map__ instead of creating a new one on each access

If you would like to make another PR based on current 3.10 code based on above comments I would be happy to review and commit it. If you don't have time or the desire to tackle this again, you still have my thanks for the ideas and your help in the improvements that have made it in so far.

@bedevere-bot
Copy link

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@belm0
Copy link
Contributor

belm0 commented Apr 11, 2021

@MrMrRobat would you mind running your enum benchmark against head?

https://gist.github.com/MrMrRobat/94b5ca3e4098669d93a6127e70359307

@ethanfurman
Copy link
Member

Creating of enums got slower with the extra dance I had to do to support __init_subclass__ properly, but using enum should be a bit faster now.

@Bobronium
Copy link
Author

@ethanfurman, thanks for the feedback! Should I open another PR, or will it be enough to just merge master changes into this one?

__member_map__ as the public API for accessing enum internals

caching __member_map__ instead of creating a new one on each access

I don't quite understand what member map you are referring to, could you please clarify?

@Bobronium
Copy link
Author

@belm0, sure, I will.

@ethanfurman
Copy link
Member

@MrMrRobat Sorry, I meant __members__. I think it would be easier to make a new patch based on the 3.10 code, and force push to this PR. A lot has changed, and many of the improvements you have here are either already incorporated or not appropriate -- for example, _name_ and _value_ being accessed via descriptors is that way to provide some protection against modifying them, and that's not going to change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting changes performance Performance or resource usage
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants