Significant Configparser Performance Regression #128641
Comments
@2trvl Thanks for the detailed description and analysis. Adding a cache to the |
@eendebakpt This method doesn't help much, the performance gain is 8.7%. I'll attach a patch made in a hurry, which gives 33%. Although ideally it would be to remove Note that the free-threading version is slowed down a lot by using regular expressions to search for comments and cached_property. Here's how it was in the previous version: Lines 987 to 1005 in 54f7e14
|
Do you mean the issue could arise from
The purpose of the PR was to reduce cyclomatic complexity (essentially have multiple small steps instead of a huge loop block).
I looked at your patch and I think we can perhaps improve it a bit more by computing first |
The patch looks good, except for the part in @2trvl Would you like to make a PR based on the ideas here? |
Free-threading works slower in single-thread, I suppose that a large number of sequential instructions can affect it. Lines 1032 to 1038 in 6116e1b
Apparently, fast creation of immutable objects and writing to their
$ python3.13t
1 loop, best of 5: 667 msec per loop
$ python3.13t
1 loop, best of 5: 392 msec per loop
GIL Python is not affected by this:
$ python3.13
1 loop, best of 5: 343 msec per loop
$ python3.13
1 loop, best of 5: 340 msec per loop
Ok, I'll try to write a decent patch in the next week and maybe create a pull request. |
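For anyone who wants to check the attribute-write cost on their own interpreter, a micro-benchmark along these lines may help. This is only a sketch: the State class is a hypothetical stand-in and not configparser's _ReadState, and the numbers will vary between builds (3.13 versus 3.13t) and machines.

```python
import timeit


class State:
    # Hypothetical stand-in for a parser state object.
    def __init__(self):
        self.lineno = 0
        self.indent = 0


def count_with_attributes(lines):
    # Every iteration writes to attributes of a shared state object.
    state = State()
    for line in lines:
        state.lineno += 1
        state.indent = len(line) - len(line.lstrip())
    return state.lineno, state.indent


def count_with_locals(lines):
    # Same work, but the running values live in local variables.
    lineno = 0
    indent = 0
    for line in lines:
        lineno += 1
        indent = len(line) - len(line.lstrip())
    return lineno, indent


if __name__ == "__main__":
    lines = ["  key = value"] * 100_000
    for func in (count_with_attributes, count_with_locals):
        best = min(timeit.repeat(lambda: func(lines), number=10, repeat=5))
        print(f"{func.__name__}: {best:.3f} s")
```

Running the same file under python3.13 and python3.13t should show whether the gap between the two functions differs between the builds.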
Hello, I am happy to announce that I have finished developing webbrowser browser finders for Windows and Unix. Today I will start working on a patch for ConfigParser. |
Best b1bc375 On 3.13 : Main 914c232 On 3.13 : Patch 2trvl@a3a696b On 3.13 : On 3.13t: The difference of 25% from the reference time is context switching and writing to My new patch replaces |
Thanks everyone for the respectful conversation and proposed approaches. I hadn't considered performance implications in the previous change other than to take an approach that I expected would have only modest effects. While the old implementation was faster, it was also prone to errors due to its complexity and duplication. I'm happy to see some progress toward restoring some or much of the performance without compromising readability (and maybe improving it). I haven't looked into the specific implementations, but the summaries sound viable. I'll be happy to review a specific PR. |
@2trvl I have a few minor suggestions for the patch, the approach looks good. Can you open a PR with the patch? |
@eendebakpt @jaraco Here is PR #129596 |
The PR is great and provides an excellent demonstration of the concerns. I'd like to validate those changes myself and test other options. I see there is a reproducer described in the original bug. It's not the kind of reproducer I can run without setting up some state, so let me work on setting something up that automates that setup. |
Here's the nixconfig.py file (for integration). |
I created this Dockerfile as a one-file reproducer:
FROM ubuntu:noble
RUN apt update
RUN apt upgrade -y
RUN apt install -y software-properties-common
RUN apt-add-repository -y ppa:deadsnakes
RUN apt update
RUN apt install -y wget libarchive-tools
# Install Pythons
RUN apt install -y python3.12 python3.12-dev python3.12-venv
RUN apt install -y python3.13 python3.13-dev python3.13-venv
# Install Python launcher
RUN wget https://github.com/brettcannon/python-launcher/releases/download/v1.0.0/python_launcher-1.0.0-$(uname -p)-unknown-linux-gnu.tar.xz -O - | tar xJ --directory /usr/local --strip-components 1
RUN wget https://github.com/user-attachments/files/18351245/shortcuts.zip -O - | bsdtar xz
RUN wget https://gist.githubusercontent.com/jaraco/b94f5314064d4dbb5fa615fd8b31672e/raw/35f8d8e3508fa50df6058e83a859f9bce2d86bc4/nixconfig.py
CMD py -3.13 -m timeit -s 'import nixconfig' 'nixconfig.main()'
But when I run it, I'm not getting anywhere close to the 2x degradation reported:
That's a 13% increase. How am I failing to reproduce the degradation? Also, it seems the impact is on the order of 10s of nanoseconds. How was this degradation noticed (what are the practical implications)? |
The nanosecond speed tells me that the files were not read. In your test, read() just excepts OSError because the file was not found. shortcuts.zip stores all files in a subfolder called shortcuts. Change line to I confirmed this regression on WSL, Void Linux, and another Arch Linux machine. |
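The silent skip described above can be guarded against by checking the return value of read(), which is the list of files that were actually parsed. Here is a minimal sketch; the helper name read_desktop_files and the choice of interpolation=None are my assumptions and are not taken from nixconfig.py.

```python
import configparser


def read_desktop_files(paths):
    """Read a list of path strings and report which ones were skipped."""
    parser = configparser.ConfigParser(interpolation=None)
    # read() ignores files it cannot open and returns only the ones it parsed.
    loaded = parser.read(paths, encoding="utf-8")
    missing = sorted(set(map(str, paths)) - set(loaded))
    return parser, loaded, missing
```

A benchmark can then assert that missing is empty before timing anything, so a wrong directory layout fails loudly instead of producing nanosecond timings.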
Yep, rookie mistake. Thanks for that tip. I've applied the suggestion and also stripped down the repro script and added a couple of assertions to ensure the desktop files are present and getting read.
FROM ubuntu:noble
RUN apt update
RUN apt upgrade -y
RUN apt install -y software-properties-common
RUN apt-add-repository -y ppa:deadsnakes
RUN apt update
RUN apt install -y wget libarchive-tools
# Install Pythons
RUN apt install -y python3.12 python3.12-dev python3.12-venv
RUN apt install -y python3.13 python3.13-dev python3.13-venv
# Install Python launcher
RUN wget https://github.com/brettcannon/python-launcher/releases/download/v1.0.0/python_launcher-1.0.0-$(uname -p)-unknown-linux-gnu.tar.xz -O - | tar xJ --directory /usr/local --strip-components 1
RUN wget https://github.com/user-attachments/files/18351245/shortcuts.zip -O - | bsdtar xz --strip-components 1
RUN wget https://gist.githubusercontent.com/jaraco/b94f5314064d4dbb5fa615fd8b31672e/raw/6e598789ca08ab0cf83ef8febd9e951c88151af8/nixconfig.py
CMD py -3.13 -m timeit -s 'import nixconfig' 'nixconfig.main()'
But I'm still seeing execution times in the low nanoseconds.
Aha! I now see my second mistake - in the CLI invocations, I've failed to actually invoke the function. Correcting that, I'm now successfully replicating the issue:
|
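The second mistake (timing a bare name instead of a call) is easy to demonstrate with the timeit module directly. This is a small sketch; main here is a hypothetical placeholder and not the function from nixconfig.py.

```python
import timeit

calls = []


def main():
    # Record each invocation so we can see whether timeit really called us.
    calls.append(1)


namespace = {"main": main}

# The statement "main" only looks the name up; the function never runs.
timeit.timeit("main", globals=namespace, number=1000)
not_called = len(calls)  # 0

# The statement "main()" actually invokes the function on every loop.
timeit.timeit("main()", globals=namespace, number=1000)
called = len(calls)  # 1000
```

A name lookup takes on the order of nanoseconds, which matches the suspiciously low timings seen before the correction.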
I'm working on this issue today. Feel free to reach out in Discord if you wish to interact in real time (I'm not sure which server is best; ping me if you don't have a good one to use). I've worked out how to run the benchmark against the PR or main. I update the Dockerfile to copy the local
|
In the PR, I've backed out some of the changes but managed to retain the bulk of the performance gains. Please take a look and let me know if the compromise is not acceptable (and why). Thanks. |
--------- Co-authored-by: Jason R. Coombs <[email protected]> Co-authored-by: Bénédikt Tran <[email protected]>
Bug report
Bug description:
Hello, @jaraco! The following commit 019143f slows down ConfigParser.read() by 2 to 6 times. _Line is created many times when reading a file, and the same regular expression is compiled for each object in _strip_inline. This amounts to a 60% speed loss. The simplest solution would be to add a __call__ method and preferably create a _Line(object) when initializing RawConfigParser with an empty string value. Or abandon the _Line object altogether.
Another 40% of the performance loss comes from using cached_property for _Line.clean (10%), writing to _ReadState attributes instead of local variables (15%), and breaking up the previous giant loop into new _handle functions (15%).
I discovered this circumstance when writing an update to webbrowser. I needed to parse hundreds of small .desktop files. At first I didn't understand the reason for the increase in execution time between different distributions, so I developed 3 versions of the program:
(M) Multiprocessing
(O) Original Routine
(T) Threading
And measured their performance using timeit:
As you can see, performance regression is 2-6 times between 3.11 and 3.13. Isolated comparison of the new and old configparser, which verifies the slowdown of free-threading by 6 times:
I also attach a small reproducible test, just a module calling read():
Archive with the above mentioned .desktop files:
shortcuts.zip
And a program for generating your own .desktop paths on Linux/BSD:
Just run this example with different interpreters and you will see the difference:
At this point I leave the solution to this problem to you, as I have no architectural vision of configparser. However, I am willing to help by proposing solutions for a pull request.
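To illustrate the first point of the report (the pattern being rebuilt for every line object), here is a simplified sketch of the two approaches. It is not the configparser source: INLINE_PREFIXES, strip_inline_per_line and InlineStripper are hypothetical names chosen for the example.

```python
import re

INLINE_PREFIXES = ("#", ";")  # hypothetical inline comment prefixes


def _build_matcher(prefixes):
    # Match a prefix at the start of the line or right after whitespace.
    # "(?!)" matches nothing, which covers the case of no prefixes.
    return re.compile(
        "|".join(rf"(^|\s)({re.escape(p)})" for p in prefixes) or "(?!)"
    )


def strip_inline_per_line(line, prefixes=INLINE_PREFIXES):
    # Rebuilds the pattern string and calls re.compile on every call.
    # The re module caches compiled patterns, but the string building
    # and the cache lookup are still paid for each line.
    match = _build_matcher(prefixes).search(line)
    return line[: match.start() if match else None].strip()


class InlineStripper:
    # Builds the pattern once and reuses it for every line.
    def __init__(self, prefixes=INLINE_PREFIXES):
        self._matcher = _build_matcher(prefixes)

    def __call__(self, line):
        match = self._matcher.search(line)
        return line[: match.start() if match else None].strip()
```

A parser could create one InlineStripper when it is initialized and call it for each line, which is roughly the shape of the __call__ suggestion in the report.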
CPython versions tested on:
3.11, 3.12, 3.13
Operating systems tested on:
Linux
Linked PRs