Re: [PATCH v2] grep: fall back to interpreter if JIT memory allocation fails

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, Jan 29 2023, Junio C Hamano wrote:

> Mathias Krause <minipli@xxxxxxxxxxxxxx> writes:
>
>> ... While we might be able to compile the pattern and run it in
>> interpreter mode, it'll likely have a *much* higher runtime.
>> ...
>> So this grep run eat up ~9.5 *hours* of CPU time. Do we really want to
>> fall back to something like this for the pathological cases? ...Yeah, I
>> don't think so either.
>
> You may not, but I do not agree with you at all.  The code should
> not outsmart the user in such a case.

It's the falling back in the nominal case that would be outsmarting the
user.

If I compile libpcre2 with JIT support I'm expecting Git to use that,
and not fall back in those cases where the JIT engine would give up.

> Even if the pattern the user came up with is impossible to grok for
> a working JIT compiler, and it might be hard to grok for the
> interpreter, what is the next step you recommend the user if you
> refuse to fall back on the interprete?  "Rewrite it to please the
> JIT compiler"?

I'd argue that it's pretty much impossible to unintentionally write such
pathological patterns, the edge cases where e.g. the JIT would run out
of resources v.s. the normal engine are a non-issue for any "normal"
use.

Pathological regexes are pretty much only interesting to anyone in the
context of DoS attacks where they're being used to cause intentional
slowdowns.

Here we're discussing an orthagonal case where the "JIT fails", but
rather than some pathological pattern it's because SELinux has made it
not work at runtime, and we're trying to tease the two cases apart.

> If that is the best pattern the user can produce to solve the
> problem at hand, being able to give the user an answer in 9 hours is
> much better than not being able to give anything at all.

Speed is a feature in itself, and in a lot of cases (e.g. user-supplied
patterns vulnerable to a DoS attack) continuing on the slow path is much
worse.

Even just using my terminal for ad-hoc "git grep", I'd *much* rather get
an early error about the pattern exceeding JIT resources than continuing
on the fallback path.

If I had somehow written one by accident (and this is stretching
credulity) you can usually apply some minor tweaks to the pattern, and
then execute it in seconds instead of minutes/hours.

> Maybe after waiting for 5 minutes, the user gets bored and ^C, or
> without killing it, open another terminal and try a different
> patern, and in 9 hours, perhaps comes up with an equivalent (or
> different but close enough) pattern that happens to run much faster,
> at which time the user may kill the original one.  In any of these
> cases, by refusing to run, the code is not doing any service to the
> user.

I don't think this is plausible at all per the above, and that we
shouldn't harm realistic use-cases to satisfy hypothetical ones.



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux