[BUG] multiline keywords breaks metadata file #4887

Open
frenzymadness opened this issue Mar 17, 2025 · 3 comments · May be fixed by pypa/distutils#347 or #4888
Labels
bug Needs Triage Issues that need to be evaluated for severity and status.

Comments

@frenzymadness
Contributor

setuptools version

74.1.3 and 76.0.0

Python version

Python 3.13

OS

Fedora Linux

Additional environment information

No response

Description

I recently discovered that if a project has keywords specified as a multiline string, the METADATA file generated from it is invalid in the sense that the headers/body split happens on the wrong line.

Example project: https://foss.heptapod.net/python-libs/passlib/-/blob/branch/stable/setup.py?ref_type=heads#L93
But you can find more like this here: https://grep.app/search?f.lang=Python&q=keywords%3D%22%22%22

Metadata generated with setuptools 74.1.3:

Provides-Extra: bcrypt
Requires-Dist: bcrypt >=3.1.0 ; extra == 'bcrypt'
Provides-Extra: build_docs
Requires-Dist: sphinx >=1.6 ; extra == 'build_docs'
Requires-Dist: sphinxcontrib-fulltoc >=1.2.0 ; extra == 'build_docs'
Requires-Dist: cloud-sptheme >=1.10.1 ; extra == 'build_docs'
Provides-Extra: totp
Requires-Dist: cryptography ; extra == 'totp'

crypt md5-crypt
sha256-crypt sha512-crypt pbkdf2 argon2 scrypt bcrypt
apache htpasswd htdigest
totp 2fa
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: BSD License
Classifier: Natural Language :: English
…

Here, the empty line ends the headers section, so the key/value pairs below it are unreachable in a Message instance parsed by email.parser.

Metadata generated with setuptools 76.0.0:

Name: passlib
Version: 1.7.4
Summary: comprehensive password hashing framework supporting over 30 schemes
Home-page: https://passlib.readthedocs.io
Download-URL: https://pypi.python.org/packages/source/p/passlib/passlib-1.7.4.tar.gz
Author: Eli Collins
Author-email: [email protected]
License: BSD
Keywords: password secret hash security
crypt md5-crypt
sha256-crypt sha512-crypt pbkdf2 argon2 scrypt bcrypt
apache htpasswd htdigest
totp 2fa
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: BSD License
…

There is no empty line here, but the Keywords header is not properly folded as a multiline header according to RFC 822, so the result is the same: the key/value pairs below it are unreachable.
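To make the effect of the missing folding concrete, here is a minimal sketch using the standard library's email.parser. The two strings are abbreviated, hand-written stand-ins for the metadata quoted above (not the full files), and the names `UNFOLDED` and `FOLDED` are only illustrative.

```python
from email.parser import Parser

# Abbreviated stand-in for the 76.0.0 output: the continuation lines of
# Keywords start in column 0, so they are not folded.
UNFOLDED = (
    "Name: passlib\n"
    "Keywords: password secret hash security\n"
    "crypt md5-crypt\n"
    "totp 2fa\n"
    "Classifier: Intended Audience :: Developers\n"
)

# Same data with each continuation line indented by one space, which is
# how a long header is folded.
FOLDED = UNFOLDED.replace("\ncrypt", "\n crypt").replace("\ntotp", "\n totp")

unfolded = Parser().parsestr(UNFOLDED)
folded = Parser().parsestr(FOLDED)

print(unfolded.keys())                 # header names seen before parsing stopped
print(unfolded.get_all("Classifier"))  # None: the classifier landed in the body
print(folded.keys())                   # Classifier is reachable again
```

In the unfolded case the parser treats `crypt md5-crypt` as the start of the message body, so everything after it, including the classifiers, ends up in the payload instead of the headers.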

I know there were several discussions about the formatting of different fields, so I'm not sure whether this is something to fix here. The documentation says that keywords can be:

comma-separated string providing descriptive meta-data.

so a multiline string does not seem to be prohibited.

Expected behavior

setuptools could either fail with an error message saying that keywords are improperly formatted or automatically replace all newlines with a space.
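The two options could look roughly like the sketch below. Both helpers, `join_keyword_lines` and `reject_multiline`, are hypothetical names invented for illustration; they are not setuptools or distutils APIs and do not claim to match any actual fix.

```python
def join_keyword_lines(keywords: str) -> str:
    """Option 1: collapse a multiline keywords string into a single line."""
    # Strip each line and drop empty ones, then join with single spaces.
    return " ".join(part.strip() for part in keywords.splitlines() if part.strip())


def reject_multiline(keywords: str) -> str:
    """Option 2: refuse any value that contains a line break."""
    if "\n" in keywords or "\r" in keywords:
        raise ValueError("keywords must not contain newlines")
    return keywords


raw = """password secret hash security
    crypt md5-crypt
    totp 2fa
"""
print(join_keyword_lines(raw))
```

Joining the lines keeps existing projects building, while rejecting them forces authors to fix their setup.py; which trade-off is preferable is a maintainer decision.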

How to Reproduce

  1. Build a wheel from one of the projects mentioned above.
  2. Unpack the wheel.
  3. Parse the METADATA file manually and check available keys.
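
Steps 2 and 3 can also be scripted. The following is a rough sketch that assumes the wheel stores its metadata in a single `*.dist-info/METADATA` member; `metadata_keys` is a hypothetical helper name and the wheel path in the usage comment is only an example.

```python
import zipfile
from email.parser import Parser


def metadata_keys(wheel):
    """Return the header names email.parser can see in a wheel's METADATA.

    `wheel` may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(wheel) as archive:
        # Assumption: exactly one member ends with .dist-info/METADATA.
        name = next(
            member for member in archive.namelist()
            if member.endswith(".dist-info/METADATA")
        )
        text = archive.read(name).decode("utf-8")
    return Parser().parsestr(text).keys()


# Example call (the path is illustrative):
# print(metadata_keys("dist/passlib-1.7.4-py2.py3-none-any.whl"))
```

If the keywords were written unfolded, the returned list stops at `Keywords` (or earlier), and fields such as `Classifier` are missing from it.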

Output

See the above examples.

@frenzymadness frenzymadness added bug Needs Triage Issues that need to be evaluated for severity and status. labels Mar 17, 2025
@abravalheri
Contributor

Hi @frenzymadness, thank you very much for the report. Would you like to submit a PR implementing the option where setuptools fails with an error message saying that keywords are improperly formatted?

@frenzymadness
Contributor Author

I'll take a look.

frenzymadness added a commit to frenzymadness/setuptools that referenced this issue Mar 18, 2025
Newlines in `keywords` or `platforms` can break
the produced metadata in PKG-INFO or METADATA files.

Fixes: pypa#4887
@frenzymadness frenzymadness linked a pull request Mar 18, 2025 that will close this issue
2 tasks
@frenzymadness
Contributor Author

It turned out that it's much easier to fix the behavior, and the fix seems to be backward compatible as well. Whitespace is already stripped during processing when the newer, comma-separated specification is followed, so this change should not break anything: it fixes the broken behavior and should annoy nobody.

frenzymadness added a commit to frenzymadness/distutils that referenced this issue Mar 19, 2025
Newlines in `keywords` or `platforms` can break
the produced metadata in PKG-INFO or METADATA files.

Fixes: pypa/setuptools#4887
frenzymadness added a commit to frenzymadness/distutils that referenced this issue Mar 19, 2025
Newlines in `keywords` or `platforms` can break
the produced metadata in PKG-INFO or METADATA files.

Fixes: pypa/setuptools#4887