On Thu, Apr 08, 2021 at 12:28:25PM -0700, Tzadik Vanderhoof wrote:
> When git-p4 reads the output from a p4 command, it assumes it will be
> 100% UTF-8. If even one character in the output of one p4 command is
> not UTF-8, git-p4 crashes with:
>
>     File "C:/Program Files/Git/bin/git-p4.py", line 774, in p4CmdList
>         value = value.decode()
>     UnicodeDecodeError: 'utf-8' codec can't decode byte 0x93 in
>     position 42: invalid start byte
>
> I'd like to make a pull request to have it try another encoding (eg
> cp1252) and/or use the Unicode replacement character, to prevent the
> whole program from crashing on such a minor problem.
>
> This is especially a problem on the "git p4 clone" command with @all,
> where git-p4 needs to read thousands of changeset descriptions, one of
> which may have a stray smart quote, causing the whole clone operation
> to fail.
>
> Sound ok?

Welcome to the Git community.

To start with: I am not a git-p4 expert as such, but a crashing program
is never a good thing. Any effort to prevent the crash is a step
forward.

You mention cp1252, which is mostly used under Windows, but there are
probably lots of systems out there that use ISO-8859-15 (or ISO-8859-1).
That gives us a first wish: make the encoding/fallback configurable.
Let people choose whether they want a crash (if things are broken), or
a fallback to cp1252 or to one of the ISO-8859-x encodings.

In that sense: we look forward to a pull request.
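
To make the wish a bit more concrete, here is a rough sketch of what a configurable fallback could look like. It is not taken from git-p4: the function name decode_p4_output and its parameters fallback_encodings and on_failure are made up for illustration, and how they would be wired to a git config setting is left open.

```python
def decode_p4_output(raw, fallback_encodings=("cp1252",), on_failure="replace"):
    """Decode bytes read from p4, trying UTF-8 first.

    Hypothetical helper, not part of git-p4.
    fallback_encodings: encodings tried in order when UTF-8 fails.
    on_failure: "strict" raises UnicodeDecodeError (the current behaviour),
                "replace" substitutes U+FFFD for undecodable bytes.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        pass

    for encoding in fallback_encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue

    if on_failure == "strict":
        # Nothing worked and the user asked for a crash: raise the
        # original UTF-8 decoding error.
        return raw.decode("utf-8")

    # Last resort: keep going, marking bad bytes with U+FFFD.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    # 0x93 and 0x94 are the cp1252 "smart quotes" from the report.
    print(decode_p4_output(b"a \x93quoted\x94 word"))
```

Note that a fallback to ISO-8859-1 never fails, because every byte value maps to a character there, so the on_failure setting only matters for encodings with unassigned bytes, such as cp1252.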