On Fri, Apr 30, 2021 at 8:33 AM Luke Diamand <luke@xxxxxxxxxxx> wrote:
>
> Tzadik - is your server unicode enabled or not? That would be
> interesting to know:
>
>     p4 counters | grep -i unicode
>
> I suspect it is not. It's only if unicode is enabled that the server
> will convert to/from utf8 (at least that's my understanding). Without
> this setting, p4d and p4 are (probably) not doing any conversions.

My server is not Unicode-enabled. These conversions happen even with a
non-Unicode Perforce db.

I don't think it's the p4d code per se that is doing the conversion,
but rather an interaction between the OS and the code, which differs
between Linux and Windows. If you create a trivial C program that
dumps the hex values of the bytes it receives in argv, you can see
the different behavior:

    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        int i, j;
        char *s;

        for (i = 1; i < argc; ++i) {
            s = argv[i];
            /* print each byte of the argument in hex */
            for (j = 0; s[j] != '\0'; ++j)
                printf(" %X", (unsigned char)s[j]);
            printf("\n");
            /* then print the argument as the program received it */
            printf("[%s]\n\n", s);
        }
        return 0;
    }

When built with Visual Studio and called from Cygwin, if you pass in
args with UTF-8 encoded characters, the program will spit them out in
cp1252. If you compile it on a Linux system using gcc, it will spit
them out in UTF-8 (unchanged).

I suspect that's what's happening with p4d on Windows vs Linux.

In any event, if you look at my patch (v6 is the latest...
https://lore.kernel.org/git/20210429073905.837-1-tzadik.vanderhoof@xxxxxxxxx/
), you will see I have written tests that pass under both Linux and
Windows. (If you want to run them yourself, you need to base my patch
off of "master", not "seen".) The tests make clear what the different
behavior is and also show that p4d is not set to Unicode (since the
tests do not change the default setting).