On Mon, Oct 24, 2016 at 10:28:14AM -0600, Jens Axboe wrote:
> How about the below? Bump the timeout to 5 min, 1 min is a little on the
> short side, we want normal error handling to be out of the way before
> that happens. And additionally, break out if we have been marked as
> reaped/exited, so we avoid grabbing the stat mutex again.

Yep, that works. I tried a test with just the second change:

> +	/*
> +	 * If we took too long to shut down, the main thread could
> +	 * already consider us reaped/exited. If that happens, break
> +	 * out and clean up.
> +	 */
> +	if (td->runstate >= TD_EXITED)
> +		break;
> +

And that's sufficient to solve the problem.

Increasing the timeout to 5 minutes would also be a good idea, so that
the worker threads can exit cleanly and the reported stats will be
completely accurate.

Thanks for your help in figuring out this long-standing problem!

- Ted
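
For readers without the full patch in front of them, here is a minimal, standalone sketch of the early-exit check quoted above. Only `td->runstate` and `TD_EXITED` come from the quoted hunk. The reduced `enum run_state`, the `TD_REAPED` ordering, the `struct thread_data` layout, and the `shutdown_loop()` helper with its `stat_updates` counter are assumptions made for illustration, not the actual code being patched.

```c
#include <stdbool.h>

/*
 * Hypothetical, reduced set of run states. The only property the check
 * relies on is that "exited" and "reaped" compare >= TD_EXITED.
 */
enum run_state {
	TD_RUNNING = 0,
	TD_FINISHING,
	TD_EXITED,
	TD_REAPED
};

/* Hypothetical stand-in for the real per-thread structure. */
struct thread_data {
	enum run_state runstate;
};

/*
 * Hypothetical worker loop. Each iteration stands in for work that would
 * take the stat mutex; stat_updates counts how often that would happen.
 * Returns true if all steps ran, false if the loop broke out early
 * because the main thread already considers this worker exited/reaped.
 */
static bool shutdown_loop(struct thread_data *td, int steps, int *stat_updates)
{
	for (int i = 0; i < steps; i++) {
		/* Same test as in the quoted hunk. */
		if (td->runstate >= TD_EXITED)
			return false;

		(*stat_updates)++;	/* placeholder for the mutex-protected work */
	}
	return true;
}
```

In this sketch a worker whose state is still `TD_RUNNING` completes every step, while one already marked `TD_EXITED` or `TD_REAPED` returns before touching the counter, which mirrors the goal of not grabbing the stat mutex again.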