On Mon, Jun 19 2017, Stefan Beller jotted: >> Now, git-grep could make use of the pluggable error facility added in >> commit c19a490e37 ("usage: allow pluggable die-recursion checks", >> 2013-04-16). > > I think we should do that instead (though I have not looked at the downsides > of this), because... It makes sense to do that in addition to what I'm doing here (or if the approach I'm taking doesn't make sense, some other patch to fix the general issue in the default handler). I'm going to try to get around to fixing the grep behavior in a follow-up patch, this is a fix for the overzealous recursion detection in the default handler needlessly causing other issues. >> >> So let's just set the recursion limit to a number higher than the >> number of threads we're ever likely to spawn. Now we won't lose >> errors, and if we have a recursing die handler we'll still die within >> microseconds. > > how are we handling access to that global variable? > Do we need to hold a mutex to be correct? or rather hope that > it works across threads, not counting on it, because each thread > individually would count up to 1024? It's not guarded by a mutex and the ++ here and the reads from it are both racy. However, for its stated purpose that's fine, even if we're racily incrementing it and losing some updates some will get through, which is good enough for an infinite recursion detection. We don't really care if we die at exactly 1 or exactly 1024. > I would prefer if we kept the number as low as "at most > one screen of lines". In practice this is the case in git, because the programs that would encounter this are going to be spawning less than screenfull of threads, assuming (as is the case) that each thread might print out one error. The semantics of that aren't changed with this patch, the difference is that you're going to get e.g. N repeats of a meaningful error instead of N repeats of either the meaningful error OR "recursion detected in die handler", depending on your luck. I.e. in current git (after a few runs to get an unlucky one): $ git grep -P --threads=10 '(*LIMIT_RECURSION=1)(*LIMIT_MATCH=1)-?-?-?---$' fatal: recursion detected in die handler fatal: recursion detected in die handler fatal: recursion detected in die handler Or if you're lucky at least one of these will be the actual error: $ git grep -P --threads=10 '(*LIMIT_RECURSION=1)(*LIMIT_MATCH=1)-?-?-?---$' fatal: recursion detected in die handler fatal: pcre_exec failed with error code -8 fatal: recursion detected in die handler fatal: recursion detected in die handler fatal: recursion detected in die handler But with this change: $ ~/g/git/git-grep -P --threads=10 '(*LIMIT_RECURSION=1)(*LIMIT_MATCH=1)-?-?-?---$' fatal: pcre2_jit_match failed with error code -47: match limit exceeded fatal: pcre2_jit_match failed with error code -47: match limit exceeded fatal: pcre2_jit_match failed with error code -47: match limit exceeded (The error message is different because I compiled with PCRE v2 locally, instead of the system PCRE v1, but that doesn't matter for this example) Over 1000 runs thish is how that breaks down on my machine, without this patch. I've replaced the recursion error with just "R" and the PCRE error with "P", and shown them in descending order by occurrence, lines without a "P" only printed out the recursion error: $ (for i in {1..1000}; do git grep -P --threads=10 '(*LIMIT_RECURSION=1)(*LIMIT_MATCH=1)-?-?-?---$' 2>&1|perl -pe 's/^fatal: r.*/R/; s/^fatal: p.*/P/' | tr '\n' ' '; echo; done)|sort|uniq -c|sort -nr|head -n 10 247 R R 136 P R R 122 P R 112 R R R 108 R 59 R P R 54 R P 54 P 31 P R R R 21 R R P There's a long tail I've omitted there of alterations to that. As this shows in >10% of cases we don't print out any meaningful error at all. But with this change: $ (for i in {1..1000}; do ~/g/git/git-grep -P --threads=10 '(*LIMIT_RECURSION=1)(*LIMIT_MATCH=1)-?-?-?---$' 2>&1|perl -pe 's/^fatal: r.*/R/; s/^fatal: p.*/P/' | tr '\n' ' '; echo; done)|sort|uniq -c|sort -nr|head -n 10 377 P P 358 P P P 192 P 63 P P P P 8 P P P P P 2 P P P P P P We will always show a meaningful error, but may of course do so multiple times, which is a subject for a fix in git-grep in particular, but the point is again, to fix the general case for the default handler. Something something sorry about the long mail didn't have time to write a shorter one :) >> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> >> --- >> usage.c | 4 +++- >> 1 file changed, 3 insertions(+), 1 deletion(-) >> >> diff --git a/usage.c b/usage.c >> index 2f87ca69a8..1c198d4882 100644 >> --- a/usage.c >> +++ b/usage.c >> @@ -44,7 +44,9 @@ static void warn_builtin(const char *warn, va_list params) >> static int die_is_recursing_builtin(void) >> { >> static int dying; >> - return dying++; >> + static int recursion_limit = 1024; >> + >> + return dying++ > recursion_limit; >> } >> >> /* If we are in a dlopen()ed .so write to a global variable would segfault >> -- >> 2.13.1.518.g3df882009 >>