On Mon, Aug 5, 2019 at 2:53 PM Junio C Hamano <gitster@xxxxxxxxx> wrote:
>
> Carlo Arenas <carenas@xxxxxxxxx> writes:
>
> > LGTM except from the suggestion below that might make the code more "standard"
> > and probably be a good base for a similar PCRE1 fix
> >>
> >> +static pcre2_general_context *get_pcre2_context(void)
> >> +{
> >> +       static pcre2_general_context *context;
> >> +
> >> +       if (!context)
> >> +               context = pcre2_general_context_create(pcre2_malloc,
> >> +                                                      pcre2_free, NULL);
> >> +
> >> +       return context;
> >> +}
> >
> > instead of using a static variable inside this helper function it
> > might be better to use
> > one extra field inside the (struct grep_pat *p), where all other
> > variables are kept
> >
> > Additionally to being more consistent will avoid creating the global
>
> "standard" and "more consistent" are good things, but I am not sure
> I should agree with the argument without knowing what you are
> comparing your suggested improvement with.  Whose standard practice
> are we trying to be consistent with?  Keeping dynamic resources hooked
> to "struct grep_pat" so that (1) different patterns could use different
> settings when they desire and (2) the resources are not hidden behind
> a function-scope static and can be discarded when we are done with
> the pattern, which is the standard in our "grep" subsystem?

It was my impression that we were abusing the struct grep_pat to avoid
having to deal properly with threading and interlocks.
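
To make the suggestion concrete, here is a minimal sketch of keeping the
context as a per-pattern field instead of a function-scope static. It is
not the actual grep.c code: "struct pat", "struct general_context",
pat_init_context() and pat_release_context() are hypothetical stand-ins
(for struct grep_pat and pcre2_general_context), used only so the
fragment builds without PCRE2.

```c
#include <stdlib.h>

/* Hypothetical stand-in for pcre2_general_context; NOT the PCRE2 type. */
struct general_context {
	void *(*alloc)(size_t size, void *data);
	void (*release)(void *ptr, void *data);
};

/* Hypothetical, cut-down stand-in for git's struct grep_pat. */
struct pat {
	const char *pattern;
	struct general_context *context; /* owned by this pattern */
};

static void *ctx_malloc(size_t size, void *data)
{
	(void)data;
	return malloc(size);
}

static void ctx_free(void *ptr, void *data)
{
	(void)data;
	free(ptr);
}

/*
 * Create the context for one pattern. Only *p is written, so callers
 * that each own their pattern do not share any state here, unlike a
 * lazily initialized function-scope static.
 */
static int pat_init_context(struct pat *p)
{
	p->context = ctx_malloc(sizeof(*p->context), NULL);
	if (!p->context)
		return -1;
	p->context->alloc = ctx_malloc;
	p->context->release = ctx_free;
	return 0;
}

/* Discard the context together with the rest of the pattern. */
static void pat_release_context(struct pat *p)
{
	if (p->context)
		p->context->release(p->context, NULL);
	p->context = NULL;
}
```

The trade-off is the one discussed in this thread: one context per
pattern (and, with the current design, per thread) in exchange for no
hidden global and a straightforward cleanup path.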

I agree my wording wasn't clear enough and my hinting was a little
obscure, but the original code is racy, and it wouldn't be if the
"global context" were initialized and maintained there; as an added
benefit, it would be straightforward to clear (together with the rest).

I am not advocating that as a good design, but I also think the code
would be shorter (which was another rationale for the proposed change:
to avoid introducing yet more bugs, since it was even suggested for
inclusion in the next release).

> I think general context probably corresponds to a bit higher level
> than individual grep_pat.  E.g. when running "grep -e foo -e bar",
> do we expect resources needed by patterns "foo" and "bar" would want
> to be allocated and freed by potentially separate <alloc,free>
> function pairs?

Not with a different design; but currently, even if almost all the time
we have the same pattern for all workers (ex: -e foo), why are we doing
the compilation (plus JIT translation) and creating this table and all
the other context pointers (plus a JIT stack) once per thread?

Just so we can move forward with a better design, I will send a proposed
patch that does things a little bit better as an RFC.

> > context for the
> > most common case (when the locale is either C/POSIX or UTF-8) and therefore
> > have a smaller impact on performance.
>
> I am not sure about the impact on performance, but if it helps us
> keep the subsystem reusable by avoiding function-scope static that
> we cannot clear, that would be a good thing.  But "struct grep_pat"
> may be a bit too fine-grained to control general context.

The "performance" point I was making was that with the current code the
chartable is only created when it is strictly needed (meaning the
pattern/haystack will be matched in non-UTF-8 mode but with characters
with codes higher than 127, and therefore MUST agree on a codepage).

Most of the time (like when using UTF-8) the chartable (and therefore
the global context) is not needed (even when using alternate allocators).

There is a chance that PCRE2 might perform better with NED, but not on
my system, and since we haven't been using NED with PCRE2 until this
proposed change, it might be better to do that independently anyway.

Carlo