On Sun, Jan 13, 2013 at 04:26:39AM +0100, Michael Haggerty wrote: > On 01/12/2013 08:23 PM, John Keeping wrote: >> Although 2to3 will fix most issues in Python 2 code to make it run under >> Python 3, it does not handle the new strict separation between byte >> strings and unicode strings. There is one instance in >> git_remote_helpers where we are caught by this. >> >> Fix it by explicitly decoding the incoming byte string into a unicode >> string. In this instance, use the locale under which the application is >> running. >> >> Signed-off-by: John Keeping <john@xxxxxxxxxxxxx> >> --- >> git_remote_helpers/git/importer.py | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/git_remote_helpers/git/importer.py b/git_remote_helpers/git/importer.py >> index e28cc8f..6814003 100644 >> --- a/git_remote_helpers/git/importer.py >> +++ b/git_remote_helpers/git/importer.py >> @@ -20,7 +20,7 @@ class GitImporter(object): >> """Returns a dictionary with refs. >> """ >> args = ["git", "--git-dir=" + gitdir, "for-each-ref", "refs/heads"] >> - lines = check_output(args).strip().split('\n') >> + lines = check_output(args).decode().strip().split('\n') >> refs = {} >> for line in lines: >> value, name = line.split(' ') >> > > Won't this change cause an exception if the branch names are not all > valid strings in the current locale's encoding? I don't see how this > assumption is justified (e.g., see git-check-ref-format(1) for the rules > governing reference names). Yes it will. The problem is that for Python 3 we need to decode the byte string into a unicode string, which means we need to know what encoding it is. I don't think we can just say "git-for-each-ref will print refs in UTF-8" since AFAIK git doesn't care what encoding the refs are in - I suspect that's determined by the filesystem which in the end probably maps to whatever bytes the shell fed git when the ref was created. That's why I chose the current locale in this case. I'm hoping someone here will correct me if we can do better, but I don't see any way of avoiding choosing some encoding here if we want to support Python 3 (which I think we will, even if we don't right now). John -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html