Re: [PATCH v2] grep: fall back to interpreter if JIT memory allocation fails


 



On Mon, Jan 30 2023, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> writes:
>
>> If I compile libpcre2 with JIT support I'm expecting Git to use that,
>> and not fall back in those cases where the JIT engine would give up.
>
> The thing is, the reason why their Git has JIT enabled pcre2 for
> many users is not because they choose to compile their own Git for
> themselves because they wanted to play with JIT.  To them, their
> distro and/or their employer gave a precompiled Git, in the hope
> that it would be faster with JIT than without when JIT is usable.
>
> In that context, "Speed is a feature in itself" is correct but
> "failing fast, forcing the user to try different things" is not a
> "Speed" feature at all.  It may be interesting only for those who
> are curious to see what pattern was rejected by JIT.  It is
> especially true as (1) we are willing to fall back to interpreter in
> the SELinux scenario, and (2) for normal users who want to use Git,
> and are not necessarily interested in playing with JIT, there is no
> other recourse than prefixing "I do not want this JITted" to their
> pattern ANYWAY.  Why fail fast and force the user to take the only
> recourse manually, when the machinery already knows what the user's
> only viable alternative is (i.e. falling back to the interpreter)?

Because we have an issue with (1), but not (2). How would (2) happen? So
far I've only seen intentionally pathological patterns designed to
trigger the JIT's limits. I don't think it's worth DWYM-ing that path,
when we're having to assume a lot about the "M" part of that.

>> Pathological regexes are pretty much only interesting to anyone in the
>> context of DoS attacks where they're being used to cause intentional
>> slowdowns.
>
> Exactly.
>
>> Here we're discussing an orthogonal case where the "JIT fails", but
>> rather than some pathological pattern it's because SELinux has made it
>> not work at runtime, and we're trying to tease the two cases apart.
>
> s/and we're/but you're/.  And I do not think you want to.

That s/// is fair, but brings me back to my question above of why we're
trying to solve (2) here.

>> I don't think this is plausible at all per the above, and I think we
>> shouldn't harm realistic use-cases to satisfy hypothetical ones.
>
> To me, what you are advocating is exactly one of the hypothetical ones that
> harm end-users who did not choose to enable JIT themselves.  When JIT
> fails for whatever reason (including the SELinux scenario) for them,
> they do not need to be told by Git failing, when the interpreter can
> give them the correct answer.  Wanting to see the result of the
> operation they asked Git to do, while allowing Git to use clever
> optimizations WHEN ABLE, is what I see as realistic use-cases.

I'm saying that "JIT fails for whatever reason" is hypothetical.
It'll fail in one of these cases:

 - The (1) case, where we're categorically unable to run the JIT. Then
   we should proceed as if the JIT isn't available (as we do when it's
   e.g. not compiled into PCRE).

 - The (2) case, where the pattern is pathological enough that it
   would take eons to execute.

   The lack of bug reports along the lines of "hey, my existing 'git
   grep' pattern failed" since the JIT shipped with v2.14.0 suggests
   that this doesn't happen in practice.

 - The case where the API is returning some new error code that's
   unknown to us, let's call that (3).
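
To make the distinction concrete, the three cases could be sketched
roughly as below. This is only an illustration, not the patch under
discussion: the status codes are hypothetical stand-ins rather than
real PCRE2 constants, and the function names are made up. The only
behavior it encodes is the one stated for (1), i.e. proceeding as if
the JIT isn't available.

```c
/*
 * Hypothetical status codes for illustration only. These are NOT the
 * real PCRE2 error constants, and the mapping from real error codes
 * to the cases below is an assumption.
 */
enum jit_status {
	JIT_OK = 0,
	JIT_ERR_NO_EXEC_MEMORY = -1,	/* assumed: JIT can't run at all, e.g. SELinux */
	JIT_ERR_PATTERN_LIMIT = -2	/* assumed: pattern hit a JIT limit */
};

enum jit_failure_kind {
	JIT_FAIL_NONE,
	JIT_FAIL_UNAVAILABLE,	/* case (1) */
	JIT_FAIL_PATHOLOGICAL,	/* case (2) */
	JIT_FAIL_UNKNOWN	/* case (3): a code we don't know about */
};

/* Map a (hypothetical) JIT status code onto one of the three cases. */
static enum jit_failure_kind classify_jit_failure(int status)
{
	switch (status) {
	case JIT_OK:
		return JIT_FAIL_NONE;
	case JIT_ERR_NO_EXEC_MEMORY:
		return JIT_FAIL_UNAVAILABLE;
	case JIT_ERR_PATTERN_LIMIT:
		return JIT_FAIL_PATHOLOGICAL;
	default:
		return JIT_FAIL_UNKNOWN;
	}
}

/*
 * Only case (1) is treated like "JIT not compiled in", i.e. carry on
 * with the interpreter. Cases (2) and (3) are deliberately not turned
 * into a silent fallback here.
 */
static int jit_should_fall_back(int status)
{
	return classify_jit_failure(status) == JIT_FAIL_UNAVAILABLE;
}
```

With this shape, a caller would check jit_should_fall_back() after a
failed JIT setup and otherwise surface the error as before.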




