On 27.01.23 18:39, Junio C Hamano wrote: > Junio C Hamano <gitster@xxxxxxxxx> writes: > >> Yes, the "instead of failing hard, fall back" makes sense. Just >> that I do not see why the runtime test is a good thing to have. In >> short, we are not in the business of catching bugs in pcre2_jit >> implementations, so if they say they cannot compile the pattern (I >> would even say I doubt the point of checking the return code to >> ensure it is NOMEMORY), it would be fine to let the interpreter >> codepath to inspect the pattern and diagnose problems with it, or >> take the slow match without JIT. >> >> What am I missing? > > Note that I've seen and recently re-read the discussion that leads to > https://lore.kernel.org/git/f680b274-fa85-6624-096a-7753a2671c15@xxxxxxxxxxxxxx/ Ahh, so ignore my last advise in the previous Email. You already read it. > I suspect that this auto-probe is related to solving "the user > thinks JIT is in use but because of failing JIT the user's pattern > is getting horrible performance" somehow. But I do not think a hard > failure is a good approach to help users in such a situation. Yes, it's exactly trying to detect this case and not cause a regression for users expecting 'git grep' to make use of the JIT. So, beside the W|X memory allocation error, other errors are likely only to point out limitations of the JIT compiler, e.g. the example I gave in https://lore.kernel.org/all/2b04b19a-a2bd-3dd5-6f21-ed0b0ad3e02f@xxxxxxxxxxxxxx/, which is, admitted, a made up example that can easily be worked around by manually prefixing it with '(*NO_JIT)'. It would be a pain having to do that for *every* pattern, but only doing it for the pathological cases should be fine. Otherwise more users would have run into it already and complained about it, no? > After such a failure, the user can prefix "(*NO_JIT)" to the pattern > and retry, or give up the operation altogether and not get a useful > result, but wouldn't it be far more helpful to just fallback as if > (*NO_JIT) was on from the beginning? Sure, if it would be required for *every*, i.e. "normal" patterns. But always doing it even for abusive ones doesn't feel right. > Also I notice that p->pcre2_jit_on is per "struct grep_pat", so it > is not like "once we see a pathological pattern, we turn off JIT > completely for other patterns", right? That is, if you have > > git grep -P -e "$A" -e "$B" > > and we fail to compile "$A" (for whatever reason), we could still > (attempt to) compile "$B". Perhaps $A was too complex or was > incompatible with JIT combined with other options, but $B may be > easy enough to still be JITtable, in which case we would match with > the JITted version of $B with interpreted version of $A, instead of > failing, right? The current version of git would fail hard if it fails to JIT compile "$A". My patch doesn't change that behavior and that's intentional, as I share Ævar's thinking about falling back to the interpreter mode for "complex patterns" (which means pathological cases, really) is a bad idea. While we might be able to compile the pattern and run it in interpreter mode, it'll likely have a *much* higher runtime. Just to get you a glimpse of how much longer, this is what it takes running the pathological pattern from above's example on git.git: $ time git grep -P "(*NO_JIT)$(perl -e 'print "(.)" x 4000')" Binary file git-gui/macosx/git-gui.icns matches Binary file t/t5000/pax.tar matches Binary file t/t5004/big-pack.zip matches Binary file t/t5004/empty-with-pax-header.tar matches real 44m42,150s user 577m14,623s sys 0m46,210s So this grep run eat up ~9.5 *hours* of CPU time. Do we really want to fall back to something like this for the pathological cases? ...Yeah, I don't think so either. Thanks, Mathias