RE: [PATCH 4/4] git-p4: resolve RCS keywords in binary

Joel Holdsworth <jholdsworth@xxxxxxxxxx> · Tue, 14 Dec 2021 13:12:13 +0000

> Makes sense, and I am with others who commented on the previous
> discussion thread that the right approach to take is to take the stuff coming
> from Perforce as byte strings, process them as such and write them out as
> byte strings, UNLESS we positively know what the source and destination
> encodings are.
> 
> And this change we see here, matching with patterns, is perfectly in line with
> that direction.  Very nice.

Not bad. Fortunately, a $ byte (0x24) can never appear as a component of a multi-byte UTF-8 character, because every byte of a multi-byte sequence has its high bit set (0x80 or above), so the matching can safely be done byte-wise.
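
To illustrate the point, here is a minimal sketch of byte-wise keyword matching. The pattern and the name collapse_keywords are hypothetical and only for illustration; they are not the exact regex or helper used in git-p4.

```python
import re

# Hypothetical pattern for illustration only; not the exact regex in git-p4.
# It is a bytes pattern, so it can be applied without decoding the input.
KEYWORD_RE = re.compile(
    rb'\$(Id|Header|Author|Date|DateTime|Change|File|Revision):[^$\n]*\$')

def collapse_keywords(line: bytes) -> bytes:
    """Collapse expanded RCS keywords, e.g. b'$Id: foo $' becomes b'$Id$'."""
    return KEYWORD_RE.sub(rb'$\1$', line)

# Works on UTF-8 input: no byte of a multi-byte character equals 0x24 ('$').
utf8_line = "$Id: //depot/é.txt#3 $ café".encode("utf-8")
utf8_result = collapse_keywords(utf8_line)

# Also works when the encoding is unknown (here, Latin-1 bytes).
latin1_line = b"$Revision: 4 $ caf\xe9"
latin1_result = collapse_keywords(latin1_line)
```

Since nothing is decoded, bytes outside the keyword are passed through untouched whatever the source encoding turns out to be.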

> 
> >          try:
> > -            with os.fdopen(handle, "w+") as outFile, open(file, "r") as inFile:
> > +            with os.fdopen(handle, "wb") as outFile, open(file, "rb") as inFile:
> 
> We seem to have lost "w+" and now it is "wb".  I do not see a reason to make
> outFile anything but write-only, so the end result looks good to me, but is it
> an unrelated "bug"fix that should be explained as such (e.g. "there is no
> reason to make outFile read-write, so instead of using 'w+' just use 'wb'
> while we make it unencoded output by adding 'b' to it")?

I am happy to split this change into a separate patch if this is preferred.
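
For reference, the write-only binary pattern under discussion looks roughly like the sketch below. The names rewrite_file and transform are hypothetical, and this is a simplified stand-in rather than the actual git-p4 code.

```python
import os
import shutil
import tempfile

def rewrite_file(path, transform):
    """Rewrite `path` line by line as bytes; `transform` maps bytes to bytes."""
    handle, tmp_path = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    try:
        # "wb": the temporary file is only ever written, never read back,
        # so there is no need for "w+"; "rb"/"wb" avoid any decoding.
        with os.fdopen(handle, "wb") as out_file, open(path, "rb") as in_file:
            for line in in_file:
                out_file.write(transform(line))
        # Replace the original only after the new content is fully written.
        shutil.move(tmp_path, path)
    except Exception:
        # Clean up the temporary file before propagating the error.
        os.unlink(tmp_path)
        raise
```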

Joel