On Thu, Dec 21, 2023 at 05:51:24AM -0500, Jeff King wrote: > I suspect this is a race in LSan caused by a thread calling exit() while > other threads are spawning. Here's my theory. > > When a thread is spawned, LSan needs to know where its stack is (so it > can look for points to reachable memory). It calls pthread_getattr_np(), > which gets an attributes object that must be cleaned up with > pthread_attr_destroy(). Presumably it does this shortly after. But > there's a race window where that attr object is allocated and we haven't > yet set up the new thread's info. If another thread calls exit() then, > LSan will run but its book-keeping will be in an inconsistent state. Thanks for digging. I agree with your theory, and am annoyed with how clever it is ;-). > So it's a pretty easy fix, though I don't love it in general. Every > place that spawns multiple threads that can die() would need the same > treatment. And this isn't a "real" leak in any reasonable sense; it only > happens because we're exiting the program directly, at which point all > of the memory is returned to the OS anyway. So I hate changing > production code to satisfy a leak-checking false positive. > > OTOH, dealing with false positives is annoying for humans, and the > run-time cost should be negligible. We can work around this one, and > avoid making the same change in other spots unless somebody sees a racy > failure in practice. Yeah... I share your thoughts here as well. It's kind of gross that we have to touch production code purely to appease the leak checker, but I think that the trade-off is worth it since: - the false positives are annoying to diagnose (as you said, and as evidenced by the time that you, Junio, and myself have sunk into discussing this ;-)). - the run-time cost is negligible. So I think that this is a good change to make, and I'm happy to see it go through. I don't think we should necessarily try too hard to find all spots that might benefit from a similar change, and instead just apply the same treatment if/when we notice false positives in CI. Thanks, Taylor