On Wed, Nov 27, 2019 at 5:32 PM Yang Zhao <yang.zhao@xxxxxxxxxxxxxx> wrote:
>
> Try to decode file paths in responses from p4 as soon as possible so
> that we are working with unicode string throughout the rest of the flow.
> This makes python 3 a lot happier.
>
> Signed-off-by: Yang Zhao <yang.zhao@xxxxxxxxxxxxxx>
> ---
>
> This is probably the most risky patch out of the set. It's very likely
> that I've neglected to consider certain corner cases with decoding of
> path data.

Yes, this does seem somewhat risky to me. It may go well on platforms
that require all filenames to be unicode. And it may work for users
who happen to restrict their filenames to valid utf-8. But this
abstraction doesn't fit the general problem, so some users may be left
out in the cold.

I tried multiple times while switching git-filter-repo from python2 to
python3, at different levels of pervasiveness, to use unicode more
generally. But I mostly gave up; everyone knows files won't
necessarily be unicode, but you just can't assume filenames or commit
messages or branch or tag names (and perhaps a few other things I'm
forgetting) are either. I ended up using bytestrings everywhere
except messages displayed to the user, and I only decode at that
point.

Of course, if perforce happens to only work with unicode filenames
then you'll be fine. And perhaps you don't want or need to be as
paranoid as I was about what people could do. So I don't know if my
experience applies in your case (I've never used perforce myself), but
I just thought I'd mention it in case it's useful.
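
To make the bytestrings-everywhere approach a bit more concrete, here is
a minimal sketch. It is not code from git-filter-repo or git-p4; the
helper names display_path and group_by_dir and the sample paths are made
up for illustration. The idea is that all internal handling stays on raw
bytes, and decoding happens only when building a message for the user,
using an error handler that cannot raise:

```python
def display_path(raw: bytes) -> str:
    """Decode a path for display only; never use the result for lookups."""
    # backslashreplace keeps undecodable bytes visible as \xNN escapes
    # instead of raising UnicodeDecodeError.
    return raw.decode("utf-8", errors="backslashreplace")


def group_by_dir(paths):
    """Group raw byte paths by directory without decoding anything."""
    groups = {}
    for path in paths:
        # rpartition works on bytes as long as the separator is bytes too.
        directory, _, name = path.rpartition(b"/")
        groups.setdefault(directory, []).append(name)
    return groups


# Hypothetical sample data: the same name encoded as UTF-8 and as Latin-1.
paths = [b"src/caf\xc3\xa9.txt", b"src/caf\xe9.txt"]

# Internal processing stays on bytes, so both paths survive untouched.
groups = group_by_dir(paths)

# Decoding happens only at the point where text is shown to the user.
messages = ["processing " + display_path(p) for p in paths]
```

With a strict decode of the second path, the script would stop with a
UnicodeDecodeError; here it stays a distinct key internally and shows up
with an escaped byte in the message.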