Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> writes:

> This v3 has a new patch (3/10) that I believe fixes the regression on
> MinGW Johannes noted in
> https://public-inbox.org/git/nycvar.QRO.7.76.6.1907011515150.44@xxxxxxxxxxxxxxxxx/
>
> As noted in the updated commit message in 10/10 I believe just
> skipping this test & documenting this in a commit message is the least
> amount of suck for now. It's really an existing issue with us doing
> nothing sensible when the log/grep haystack encoding doesn't match the
> needle encoding supplied via the command line.

Is that quite the case?  If they do not match, not finding the match
is the right answer, because we are byte-for-byte matching/searching
IIUC.

> We swept that under the carpet with the kwset backend, but PCRE v2
> exposes it.

Is it exposing, or just showing the limitation of the rewritten
implementation where it cannot do byte-for-byte matching/searching
as we used to be able to?

Without having a way to know what encoding is used on the command
line, there is no sensible way to reencode them to match the
haystack encoding (even when it is known), so "you got to feed the
strings in the same encoding, as we are going to match/search
byte-for-byte" is the only sensible way to work, given the design
space, I would think.

Not that it is all that useful to be able to match/search
byte-for-byte, of course, so I am OK if we punt with these tests,
but I'd prefer to see us admit we are punting when we do ;-).
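
To make the byte-for-byte point above concrete, here is a minimal
sketch in Python.  It is only an illustration and not how the grep
machinery in git is written; the helper bytes_match() and the sample
strings are made up for this example.  It assumes the needle arrives
on the command line as UTF-8 and compares it against the same text
stored as UTF-8 and as ISO-8859-1.

```python
# Hypothetical illustration of byte-for-byte searching; none of these
# names come from git itself.

text = "Author: Ævar"

# The same haystack text stored in two different encodings.
haystack_utf8 = text.encode("utf-8")
haystack_latin1 = text.encode("iso-8859-1")

# Assumption: the needle was typed on a UTF-8 command line.
needle_utf8 = "Ævar".encode("utf-8")


def bytes_match(needle: bytes, haystack: bytes) -> bool:
    """Return True if needle occurs in haystack as a raw byte sequence."""
    return needle in haystack


# Encodings agree: "Æ" is the two bytes C3 86 on both sides.
same = bytes_match(needle_utf8, haystack_utf8)

# Encodings differ: "Æ" is the single byte C6 in ISO-8859-1, so the
# UTF-8 needle bytes never appear in the haystack.
mixed = bytes_match(needle_utf8, haystack_latin1)

print(same, mixed)
```

With mismatched encodings the search finds nothing, which is the
expected outcome of a byte-for-byte comparison.  Finding a match would
require knowing the encoding of the needle so it could be reencoded
first.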