Don't change the meaning of string literals #554
Dear @nvie, I deeply apologize for the problems this change has caused you. What is surprising is that no test case (TC) fails when reverting this line. What is unsettling is that we have no TCs to detect the problems you had.
No need to! We've all been there ;)
I thought that might have been the case since it's so isolated, just wanted to double-check this.
I thought of this too, but the bugs are so subtle and hard to replicate that I didn't have a good place to start adding them. The original bugs we have been seeing popped up in a whole different section of the code, so it was quite a journey to end up here and find that this was the root cause. FWIW, I've included the stack trace that triggered our bug hunt here. Example git-fetch output that causes this bug to reveal itself:
Note the "smart quotes" in the 2nd line. That's an example of a repo that will fail. Stack trace does not reveal any key source code position. I had to trace the data at each stack level and see where data was "wrong", then try to trace back how that data came to be that way.
As you can see, this stack trace is of no help finding this bug.
Thanks for the quick response, though. I went ahead and merged this one myself.
unicode_literals were forgotten from a Q'nD (quick-and-dirty) experiment while fighting with Win+PY3 errors.
@Byron Could you perhaps roll a release of GitPython to PyPI that includes this fix? Would be appreciated!
@nvie Will have to work through more emails and PRs, but will cut a new release when done. That should happen today as well.
Thanks, no rush!
@ankostis Yes and no. We thought it was fetch-related for a long time at first, because that's the only place it popped up for us. I think more Git commands could be affected however. The specific bug happened on this line in the end. Because of the This particular one happened in the
Here's one idea to systematically find all places affected:
Any place where this fails is a similar bug.
This line singlehandedly has caused a slew of painful bugs for us in production since we upgraded from 2.0.8 to 2.0.9: we have been seeing various UnicodeDecodeErrors on repositories with branches, tags, etc. that have non-ASCII characters in them. For most repos, which only use the ASCII alphabet, implicit encoding and decoding is now happening as well, and most of the time we get lucky because they accidentally just work.
After reverting this one liner, our production issues all went away.
@ankostis What's the reason this line was included in f11fdf1? It's the only module in the GitPython code base now that uses unicode literals, which makes reasoning about the code related to strings in this module rather confusing. Could we perhaps just change the strings that should be unicode for your patch and mark those with u'...' string literals explicitly rather than changing the meaning of all strings at once?
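For readers following along, here is a minimal sketch of the failure mode described above. It is not GitPython code. `py2_implicit_concat` is a hypothetical helper that spells out, on Python 3, the ASCII decode that Python 2 performs silently when a unicode literal (which every plain literal becomes under `unicode_literals`) is combined with a byte string. The ref names are made-up sample values.

```python
def py2_implicit_concat(text, raw):
    """Hypothetical helper: make explicit what Python 2 does silently for
    u"..." + "...", i.e. decode the byte string with the ASCII codec."""
    return text + raw.decode("ascii")


# Byte strings standing in for names read from git output (made-up values).
ascii_ref = b"refs/heads/master"
smart_quote_ref = "refs/heads/\u201cquoted\u201d".encode("utf-8")

# ASCII-only data "accidentally just works".
ok = py2_implicit_concat("ref: ", ascii_ref)

# Non-ASCII data fails, and the error surfaces wherever the strings meet,
# not at the literal whose meaning was changed.
try:
    py2_implicit_concat("ref: ", smart_quote_ref)
    failed = False
except UnicodeDecodeError:
    failed = True

# Decoding explicitly with a known encoding handles both cases.
explicit = "ref: " + smart_quote_ref.decode("utf-8")
```

This is why marking only the intended strings with u'...' is easier to reason about: each text/bytes boundary stays visible where it is written, rather than being introduced module-wide by a single import.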