On Tue, Jun 20, 2017 at 08:49:59PM +0200, Ævar Arnfjörð Bjarmason wrote: > As you can see the third most common case is that we needlessly print > out the warning, i.e. we have only one error anyway, but we can't > guarantee that, so it probably makes sense to emit it. Right, my suggestion actually doesn't help much with the multi-threaded case. The warning is pointless but it will still be racily shown, because we can't tell the difference between races and recursion. So it's better, but fundamentally still pretty ugly. And we'd want to actually fix the real problem. > To reply to your 20170620161514.ygbflanx4pldc7n7@xxxxxxxxxxxxxxxxxxxxx > downthread here (where you talk about setting up a custom die handler > for grep) yeah that would make sense, but as long as we're supplying > this default behavior (and not outlawing using it with pthreads) it > makes sense to get out of our own way with this recursion detection. I actually was more or less proposing that we consider stock die (no custom handler) with pthreads to be outlawed. It sometimes kind of does the right thing, but in a racy way. Probably threads calling die() should consider their error-handling a bit more carefully. That said, in my other mail I arrived at "just put a mutex into die()" which I think would remove the raciness, and give the default behavior of "the first thread to call die will take down the whole process and get its single error printed". That actually seems pretty reasonable. And it makes the "recursion or race?" question go away. I'm not quite sure how to implement it, though. I think you'd want to take the lock in die() itself, because it would make the is-recursing check all the way to the exit an atomic unit. But who is responsible for unlocking then? The point of die_routine() is that it never returns. That's fine if it truly calls exit(), but the threaded die handler in run-command.c does a pthread_exit(). So _somebody_ has to unlock that so other threads don't block if they try to die(). What a mess. > I think my patch (possibly with the fixup above, depending on what we > think about dupe warnings) is just fine to fix this. This is already a > super-rare edge case in grep, and to the extent that it would be a > problem for anyone it's because our paranoid recursion detector totally > hides the error, I don't think it's worth worrying about us printing a > few dupe error messages under threading for something that almost never > happens. > > At least, that's starting to go beyond my bothering to hack on it :) Yes, I think your patch with my suggestion above is a strict improvement over what's there. You'd see the warning any time you might have seen "die is recursing", but at least you _also_ get to see the real error. -Peff