Re: [PATCH 2/8] git_remote_helpers: fix input when running under Python 3

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, Jan 13, 2013 at 04:26:39AM +0100, Michael Haggerty wrote:
> On 01/12/2013 08:23 PM, John Keeping wrote:
>> Although 2to3 will fix most issues in Python 2 code to make it run under
>> Python 3, it does not handle the new strict separation between byte
>> strings and unicode strings.  There is one instance in
>> git_remote_helpers where we are caught by this.
>> 
>> Fix it by explicitly decoding the incoming byte string into a unicode
>> string.  In this instance, use the locale under which the application is
>> running.
>> 
>> Signed-off-by: John Keeping <john@xxxxxxxxxxxxx>
>> ---
>>  git_remote_helpers/git/importer.py | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/git_remote_helpers/git/importer.py b/git_remote_helpers/git/importer.py
>> index e28cc8f..6814003 100644
>> --- a/git_remote_helpers/git/importer.py
>> +++ b/git_remote_helpers/git/importer.py
>> @@ -20,7 +20,7 @@ class GitImporter(object):
>>          """Returns a dictionary with refs.
>>          """
>>          args = ["git", "--git-dir=" + gitdir, "for-each-ref", "refs/heads"]
>> -        lines = check_output(args).strip().split('\n')
>> +        lines = check_output(args).decode().strip().split('\n')
>>          refs = {}
>>          for line in lines:
>>              value, name = line.split(' ')
>> 
> 
> Won't this change cause an exception if the branch names are not all
> valid strings in the current locale's encoding?  I don't see how this
> assumption is justified (e.g., see git-check-ref-format(1) for the rules
> governing reference names).

Yes it will.  The problem is that for Python 3 we need to decode the
byte string into a unicode string, which means we need to know what
encoding it is.

I don't think we can just say "git-for-each-ref will print refs in
UTF-8" since AFAIK git doesn't care what encoding the refs are in - I
suspect that's determined by the filesystem which in the end probably
maps to whatever bytes the shell fed git when the ref was created.

That's why I chose the current locale in this case.  I'm hoping someone
here will correct me if we can do better, but I don't see any way of
avoiding choosing some encoding here if we want to support Python 3
(which I think we will, even if we don't right now).


John
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]