On Sun, Jan 29 2023, Junio C Hamano wrote:

> Mathias Krause <minipli@xxxxxxxxxxxxxx> writes:
>
>> ... While we might be able to compile the pattern and run it in
>> interpreter mode, it'll likely have a *much* higher runtime.
>> ...
>> So this grep run eat up ~9.5 *hours* of CPU time. Do we really want to
>> fall back to something like this for the pathological cases? ...Yeah, I
>> don't think so either.
>
> You may not, but I do not agree with you at all. The code should
> not outsmart the user in such a case.

It's the falling back in the nominal case that would be outsmarting the
user. If I compile libpcre2 with JIT support I'm expecting Git to use
that, and not fall back in those cases where the JIT engine would give
up.

> Even if the pattern the user came up with is impossible to grok for
> a working JIT compiler, and it might be hard to grok for the
> interpreter, what is the next step you recommend the user if you
> refuse to fall back on the interpreter? "Rewrite it to please the
> JIT compiler"?

I'd argue that it's pretty much impossible to unintentionally write such
pathological patterns; the edge cases where e.g. the JIT would run out
of resources vs. the normal engine are a non-issue for any "normal" use.

Pathological regexes are pretty much only interesting to anyone in the
context of DoS attacks where they're being used to cause intentional
slowdowns.

Here we're discussing an orthogonal case where the "JIT fails", but
rather than some pathological pattern it's because SELinux has made it
not work at runtime, and we're trying to tease the two cases apart.

> If that is the best pattern the user can produce to solve the
> problem at hand, being able to give the user an answer in 9 hours is
> much better than not being able to give anything at all.

Speed is a feature in itself, and in a lot of cases (e.g. user-supplied
patterns vulnerable to a DoS attack) continuing on the slow path is much
worse.

Even just using my terminal for ad-hoc "git grep", I'd *much* rather get
an early error about the pattern exceeding JIT resources than continue
on the fallback path. If I had somehow written one by accident (and this
is stretching credulity), I could usually apply some minor tweaks to the
pattern, and then execute it in seconds instead of minutes/hours.

> Maybe after waiting for 5 minutes, the user gets bored and ^C, or
> without killing it, open another terminal and try a different
> pattern, and in 9 hours, perhaps comes up with an equivalent (or
> different but close enough) pattern that happens to run much faster,
> at which time the user may kill the original one. In any of these
> cases, by refusing to run, the code is not doing any service to the
> user.

I don't think this is plausible at all per the above, and I don't think
we should harm realistic use-cases to satisfy hypothetical ones.
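
To make the distinction drawn above concrete (a JIT that cannot run at
all vs. a JIT that gives up on one particular pattern), here is a rough
sketch of the kind of policy being argued for. It is only an
illustration: JitResult, Action and choose_action are hypothetical
names, not Git's or PCRE2's API, and it assumes the "JIT unavailable at
runtime" case (e.g. SELinux denying executable memory) is the one where
falling back to the interpreter is acceptable.

```python
from enum import Enum, auto


class JitResult(Enum):
    """Hypothetical outcomes of trying to JIT-compile a pattern."""
    OK = auto()                   # the JIT compiled the pattern
    UNAVAILABLE = auto()          # the JIT cannot run at all (assumed: e.g. SELinux)
    PATTERN_TOO_COMPLEX = auto()  # the JIT gave up on this specific pattern


class Action(Enum):
    """Hypothetical choices for what grep does next."""
    USE_JIT = auto()
    USE_INTERPRETER = auto()
    FAIL_EARLY = auto()


def choose_action(result: JitResult) -> Action:
    """Pick what to do after a JIT compile attempt.

    Assumption: only an environment-level failure warrants a silent
    fallback; a pattern the JIT rejects is reported to the user at once
    instead of being run on the slow path.
    """
    if result is JitResult.OK:
        return Action.USE_JIT
    if result is JitResult.UNAVAILABLE:
        return Action.USE_INTERPRETER
    return Action.FAIL_EARLY
```

With this split, a pattern that exhausts JIT resources produces an
immediate error the user can act on, while a system where the JIT
simply cannot work still gets results from the interpreter.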