Re: [PATCH 1/1] pcre2: allow overriding the system allocator

On Mon, Aug 5, 2019 at 2:53 PM Junio C Hamano <gitster@xxxxxxxxx> wrote:
>
> Carlo Arenas <carenas@xxxxxxxxx> writes:
>
> > LGTM except from the suggestion below that might make the code more "standard"
> > and probably be a good base for a similar PCRE1 fix
> >>
> >> +static pcre2_general_context *get_pcre2_context(void)
> >> +{
> >> +       static pcre2_general_context *context;
> >> +
> >> +       if (!context)
> >> +               context = pcre2_general_context_create(pcre2_malloc,
> >> +                                                      pcre2_free, NULL);
> >> +
> >> +       return context;
> >> +}
> >
> > instead of using a static variable inside this helper function it
> > might be better to use
> > one extra field inside the (struct grep_pat *p), where all other
> > variables are kept
> >
> > Additionally to being more consistent will avoid creating the global
>
> "standard" and "more consistent" are good things, but I am not sure
> I should agree with the argument without knowing what you are
> comparing your suggested improvement with.  Whose standard practice
> are we trying to be consistent with?  Keeping dynamic resources hooked
> to "struct grep_pat" so that (1) different patterns could use different
> settings when they desire and (2) the resources are not hidden behind
> a function-scope static and can be discarded when we are done with
> the pattern, which is the standard in our "grep" subsystem?

It was my impression that we were abusing the struct grep_pat to avoid
having to deal properly with threading and interlocks.

I agree my wording wasn't clear enough and my hinting was a little obscure,
but the original code is racy, and it wouldn't be if the "global context"
were initialized and maintained there; as an added benefit, it would be
straightforward to clear (together with the rest).
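
A minimal sketch of the idea, not the actual patch. Every name below
(ctx_sketch, pat_sketch, pat_sketch_compile, pat_sketch_free) is
hypothetical, and a plain malloc'ed struct stands in for the
pcre2_general_context so the example builds without PCRE2. It only shows
the context being created and released together with the rest of the
per-pattern state, so there is no function-scope static left to race on:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for pcre2_general_context. */
struct ctx_sketch {
	void *(*alloc)(size_t);
	void (*release)(void *);
};

/* Hypothetical stand-in for the relevant part of struct grep_pat. */
struct pat_sketch {
	const char *pattern;
	struct ctx_sketch *general_context; /* owned by this pattern */
};

/* Create the context as part of setting up the pattern. */
static void pat_sketch_compile(struct pat_sketch *p, const char *pattern)
{
	assert(p != NULL);
	p->pattern = pattern;
	p->general_context = malloc(sizeof(*p->general_context));
	if (!p->general_context)
		abort();
	p->general_context->alloc = malloc;
	p->general_context->release = free;
}

/* Release the context together with the rest of the pattern state. */
static void pat_sketch_free(struct pat_sketch *p)
{
	assert(p != NULL);
	free(p->general_context);
	p->general_context = NULL;
}
```

The cost, of course, is one context per pattern copy instead of one per
process.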

I am not advocating that as a good design, but I also think the code would
be shorter (which was another rationale for the proposed change: to avoid
introducing yet more bugs, since it was even suggested for inclusion
in the next release).

> I think general context probably corresponds to a bit higher level
> than individual grep_pat.  E.g. when running "grep -e foo -e bar",
> do we expect resources needed by patterns "foo" and "bar" would want
> to be allocated and freed by potentially separate <alloc,free>
> function pairs?

Not with a different design; but currently, even though almost all the time
we have the same pattern for all workers (e.g. -e foo), why are we doing
the compilation (plus JIT translation) and creating this table and all the
other context pointers (plus a JIT stack) once per thread?

So that we can move forward with a better design, I will send a proposed
patch that does things a little better as an RFC.

> > context for the
> > most common case (when the locale is either C/POSIX or UTF-8) and therefore
> > have a smaller impact on performance.
>
> I am not sure about the impact on performance, but if it helps us
> keep the subsystem reusable by avoiding function-scope static that
> we cannot clear, that would be a good thing.  But "struct grep_pat"
> may be a bit too fine-grained to control general context.

The "performance" point I was making was that with the current code
the chartable is only created when it is strictly needed (meaning the
pattern/haystack will be matched in non-UTF-8 mode but with characters
whose code is higher than 127, and which therefore MUST agree on a codepage).

Most of the time (like when using UTF-8) the chartable (and therefore the
global context) is not needed (even when using alternate allocators).
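
Roughly, the condition looks like the sketch below. The function names are
hypothetical and this is a simplification of the real check: it only
inspects the pattern, not the haystack, and takes the UTF-8 decision as a
plain flag.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if any byte of s has a code above 127, 0 otherwise. */
static int has_non_ascii_sketch(const char *s)
{
	const unsigned char *p = (const unsigned char *)s;

	for (; *p; p++)
		if (*p > 127)
			return 1;
	return 0;
}

/*
 * A chartable (and so the context it hangs off) is only worth creating
 * when matching is not done in UTF-8 mode and the pattern contains
 * bytes above 127 whose meaning depends on the codepage.
 */
static int needs_chartable_sketch(const char *pattern, int utf8_mode)
{
	assert(pattern != NULL);
	return !utf8_mode && has_non_ascii_sketch(pattern);
}
```

Only when this returns true would something like pcre2_maketables() have
to be called at all.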

There is a chance that PCRE2 might perform better with NED, but not
on my system, and since we haven't been using NED with PCRE2 until
this proposed change it might be better to do that independently anyway.

Carlo


