On Wed, May 24, 2017 at 7:17 AM, Junio C Hamano <gitster@xxxxxxxxx> wrote: > Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> writes: > >> diff --git a/grep.c b/grep.c >> index 1157529115..49e9aed457 100644 >> --- a/grep.c >> +++ b/grep.c >> @@ -351,6 +351,9 @@ static void compile_pcre1_regexp(struct grep_pat *p, const struct grep_opt *opt) >> const char *error; >> int erroffset; >> int options = PCRE_MULTILINE; >> +#ifdef PCRE_CONFIG_JIT >> + int canjit; >> +#endif > > Is "canjit" a property purely of the library (e.g. version and > compilation option), or of combination of that and nature of the > pattern, or something else like the memory pressure? > > I am wondering if it is worth doing something like this: > > static int canjit = -1; > if (canjit < 0) { > pcre_config(PCRE_CONFIG_JIT, &canjit); > } > > if it depends purely on the library linked to the process. It purely depends on how the the library, was compiled. I just wrote it like that because compiling the pattern is not a hot codepath (i.e. we call this max 8 or so times or so, whereas exec will be called thousands/millions/billions of times), so trying to avoid calling this trivial function seemed pointless. But looking at this again it would be simpler to combine what you're suggesting with just passing a pointer to *.pcre[12]_jit_on directly, skipping the canjit variables. >> @@ -365,9 +368,20 @@ static void compile_pcre1_regexp(struct grep_pat *p, const struct grep_opt *opt) >> if (!p->pcre1_regexp) >> compile_regexp_failed(p, error); >> >> - p->pcre1_extra_info = pcre_study(p->pcre1_regexp, 0, &error); >> + p->pcre1_extra_info = pcre_study(p->pcre1_regexp, PCRE_STUDY_JIT_COMPILE, &error); >> if (!p->pcre1_extra_info && error) >> die("%s", error); >> + >> +#ifdef PCRE_CONFIG_JIT >> + pcre_config(PCRE_CONFIG_JIT, &canjit); >> + if (canjit == 1) { >> + p->pcre1_jit_stack = pcre_jit_stack_alloc(1, 1024 * 1024); >> + if (!p->pcre1_jit_stack) >> + die("BUG: Couldn't allocate PCRE JIT stack"); > > I agree that dying is OK, but as far as I can tell, this is not a > BUG (there is no error a programmer can correct by a follow-up > patch); please do not mark it as such (it is likely that we'll later > do a tree-wide s/die("BUG:/BUG("/ and this will interfere). Makes sense. Looks like the convention for this sort of thing is to just do s/BUG: //, e.g. the code in wrapper.c does that. >> + pcre_assign_jit_stack(p->pcre1_extra_info, NULL, p->pcre1_jit_stack); >> + p->pcre1_jit_on = 1; > > Contrary to what I wondered about "canjit" above, I think it makes > tons of sense to contain the "is JIT in use?" information in "struct > grep_pat" and not rely on any global state. Not that we are likely > to want to be able to JIT some patterns while not doing others. So > I agree with the design choice of adding pcre1_jit_on field to the > structure. > > But then, wouldn't it make more sense to do all of the above without > the canjit variable at all? i.e. something like... > > #ifdef PCRE_CONFIG_JIT > pcre_config(PCRE_CONFIG_JIT, &p->pcre1_jit_on); > if (p->pcre1_jit_on) > ... stack thing ... > #else > p->pcre1_jit_on = 0; > #endif *Nod* >> +#ifdef PCRE_CONFIG_JIT >> + if (p->pcre1_jit_on) { >> + pcre_free_study(p->pcre1_extra_info); >> + pcre_jit_stack_free(p->pcre1_jit_stack); >> + } else >> +#endif >> + /* PCRE_CONFIG_JIT !p->pcre1_jit_on else branch */ >> pcre_free(p->pcre1_extra_info); >> + >> pcre_free((void *)p->pcre1_tables); > > It is very thoughtful to add a blank line here (and you did the same > in another similar hunk), but I have a feeling that it is still a > bit too subtle a hint to signal to the readers that these two > pcre_free()s fire differently, i.e. the former does not fire if jit > is on but the latter always fires. > > Would this be a bit safer while being not too ugly to live, I wonder? > > #ifdef PCRE_CONFIG_JIT > if (p->pcre1_jit_on) { > pcre_free_study(p->pcre1_extra_info); > pcre_jit_stack_free(p->pcre1_jit_stack); > } else > #endif > { > /* PCRE_CONFIG_JIT !p->pcre1_jit_on else branch */ > pcre_free(p->pcre1_extra_info); > } > pcre_free((void *)p->pcre1_tables); > Makes sense. I'll change it.