On Thu, Dec 15, 2022 at 12:11 AM Junio C Hamano <gitster@xxxxxxxxx> wrote:
>
> Tao Klerks <tao@xxxxxxxxxx> writes:
>
> > Again, I'm not attempting to defend the breakage - just outlining why
> > I don't see how "using the Perforce variable P4CHARSET" would solve
> > anything.
> >
> >> This new behavior has made it impossible for
> >> me to submit changes to files of type "utf8"! Any attempt fails with
> >> "patch does not apply" and the erroneously added BOM is the cause.
> >
> > I will try to understand the "unicode enabled server" behavior today
> > or tomorrow and see what options might make sense.
> >
> >> I propose rolling back the patch that introduced this behavior,
> >
> > Junio is the expert here and has noted it's a little late for that. I
> > obviously defer to his expertise as to git's release and backout
> > strategy.
> >
>
> It sounds like, if your conjecture turns out to be correct in that
> those P4 users who interact unicode enabled servers would have
> P4CHARSET and others don't, we may not need an extra configuration
> but pay attention to the P4CHARSET variable (or lack of it) and
> switch the behaviour.

Yes, I suspect some sort of detection will be required. There appears
to be no way to query the server for this "unicode mode" directly, but
you can force the client to try connecting in the "wrong" mode for the
server, and catch the corresponding error. Ugly, but effective.

(the reason it's hard to just test for "P4CHARSET" is that there are
several ways to set it, not just the environment, and there are
multiple versions of the setting, per-connection or global; setting
the global override and testing for failure is likely to be safer than
attempting to understand/evaluate the hierarchy of settings)

> > I would like to have a go at understanding what the options are (how
> > we can get correct and functional behavior for all users), before
> > proposing a specific course of action.
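For concreteness, the "force the wrong mode and catch the error" idea
mentioned above could look roughly like the sketch below. This is only
an illustration, not what git-p4 does today: the helper names
(run_p4, detect_unicode_server) are made up, the two error strings are
what I believe p4 prints but should be checked against the client and
server versions we care about, and I am assuming that "p4 -C <charset>"
overrides every other P4CHARSET source and that "info" is a suitable
probe command.

```python
import subprocess

# Assumed error texts; verify them against the p4 versions being targeted.
UNICODE_SERVER_ERR = "Unicode server permits only unicode enabled clients"
PLAIN_SERVER_ERR = "Unicode clients require a unicode enabled server"


def run_p4(args):
    """Run p4 with the given arguments and return stdout and stderr combined."""
    proc = subprocess.run(["p4"] + list(args), capture_output=True,
                          text=True, errors="replace")
    return proc.stdout + proc.stderr


def detect_unicode_server(run=run_p4, probe="info"):
    """Guess the server mode by deliberately connecting in the wrong mode.

    Returns True for a unicode-enabled server, False for a normal one,
    and None when neither expected error shows up (detection failed).
    """
    # Pretend to be a non-unicode client; a unicode server should refuse.
    if UNICODE_SERVER_ERR in run(["-C", "none", probe]):
        return True
    # Pretend to be a unicode client; a non-unicode server should refuse.
    if PLAIN_SERVER_ERR in run(["-C", "utf8", probe]):
        return False
    return None
```

Passing the runner in as a parameter keeps the classification logic
testable without a live server, and returning None leaves room to fall
back to the current behavior when detection is inconclusive.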

I have finally managed to start testing with the "unicode enabled
server" behavior. So far I've learned that:

* Some of our tests around file content encoding handling do fail with
  the server in this mode (not necessarily because we're doing the
  wrong thing, but because the server's behavior doesn't match our
  expectations); these failures may correspond to bugs to be fixed, or
  tests to be adjusted to match appropriate expectations in this
  "unicode enabled mode"
* Our tests around "git p4 submit" *don't* seem to fail, even on
  utf-8-bom files - so I have not yet reproduced Tzadik's issue

(I keep placing "unicode enabled server" in quotes because I don't
want to give the impression that perforce in "normal" mode doesn't
handle unicode content - it absolutely does, but... differently.)

I definitely need to keep testing around this to understand what the
right thing to do for Tzadik (and others like him of course) might be.

Tzadik, could you provide any more detail about the failing situation?
One piece of info that might be particularly helpful is *what is the
exact/full p4 FileType of the problem file?*

Thanks,
Tao