gh-91810: ElementTree: Change default encoding in XML declaration to UTF-8 #91812

Closed
wants to merge 2 commits into from


methane
Member

methane commented Apr 22, 2022

Fixes #91810

@serhiy-storchaka
Member

Look at the function _get_writer() when encoding is "unicode". If file_or_filename is not a file, it opens a file with the default encoding (which was usually the locale encoding). If file_or_filename is a file, there was a good chance that it was sys.stdout or a file opened with the default encoding. So it made sense to use the locale encoding as the declared encoding. You would be right more often than wrong.

Now, it would be more correct to derive the declared encoding from the file encoding. Make _get_writer() return a pair: the write function and the declared encoding (which should be getattr(file, "encoding", None) or "utf-8" by default).
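
A minimal sketch of the fallback described above, not the actual ElementTree code. The helper names `declared_encoding` and `write_declaration` and the `WriteOnly` class are hypothetical; only the expression `getattr(file, "encoding", None) or "utf-8"` comes from the comment.

```python
import io


def declared_encoding(file, default="utf-8"):
    """Return the encoding to declare for a text output (hypothetical helper)."""
    # Objects without an `encoding` attribute, or whose encoding is None,
    # fall back to the default.
    return getattr(file, "encoding", None) or default


def write_declaration(file):
    """Write an XML declaration that matches the file's own encoding."""
    enc = declared_encoding(file)
    file.write("<?xml version='1.0' encoding='%s'?>\n" % enc)
    return enc


class WriteOnly:
    """A minimal writer with no `encoding` attribute (illustrative only)."""

    def __init__(self):
        self.chunks = []

    def write(self, text):
        self.chunks.append(text)


# A text file opened with an explicit encoding reports that encoding.
latin1_file = io.TextIOWrapper(io.BytesIO(), encoding="latin-1")
print(write_declaration(latin1_file))

# Outputs with no usable encoding get the UTF-8 default.
print(write_declaration(io.StringIO()))
print(write_declaration(WriteOnly()))
```

With this approach the declaration follows the file when the file knows its encoding, and falls back to UTF-8 otherwise.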

@methane
Member Author

methane commented Apr 25, 2022

You are right. There are a few cases where the locale encoding is the right choice.
But I still think UTF-8 should be the default encoding for XML.

> Now, it would be more correct to derive the declared encoding from the file encoding. Make _get_writer() return a pair: the write function and the declared encoding (which should be getattr(file, "encoding", None) or "utf-8" by default).

I don't like this kind of approach, which works correctly only in some cases.
There are many cases where the output has no encoding. UTF-8 by default is my preferred behavior.
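
For callers who do not want to depend on how the default is chosen, one option is to pass the encoding explicitly. This is a small sketch using the standard `xml.etree.ElementTree.tostring()` function (the `xml_declaration` parameter needs Python 3.8 or later); the element name and text are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Illustrative document with a non-ASCII character.
root = ET.Element("root")
root.text = "caf\u00e9"

# With an explicit byte encoding, the declaration names that encoding
# and the result is bytes encoded accordingly.
data = ET.tostring(root, encoding="utf-8", xml_declaration=True)
print(data.decode("utf-8"))
```

Here the declared encoding and the actual byte encoding come from the same argument, so they cannot disagree.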

Anyway, I don't want to discuss this further on this pull request. Please continue on the issue instead.

Development

Successfully merging this pull request may close these issues.

ElementTree should use UTF-8 for xml declaration.