On Thu, Oct 20, 2016 at 08:22:00AM -0600, Jens Axboe wrote:
> > So what's happening is that generic/299 is looping in the
> > fallocate/truncate loop until fio exits, but since fio never exits,
> > it ends up looping forever.
>
> I'm setting up the GCE now, I've had the tests running for about 24h now
> on another test box and haven't been able to trigger any hangs. I'll
> match your setup as closely as I can, hopefully that'll work.

Any luck reproducing the problem?

On Wed, Oct 19, 2016 at 08:06:44AM -0600, Jens Axboe wrote:
>
> I'll take a look today. I agree, this definitely looks like a fio
> bug. But not related to the mutex issue for the stat part; all verifier
> threads are waiting to be woken up, but the main thread is done.

I was taking a closer look at this, and it does look like it's related
to the stat_mutex. The main thread (according to gdb) seems to be stuck
in this loop in backend.c at line 1738 (in thread_main):

	do {
		check_update_rusage(td);
		if (!fio_mutex_down_trylock(stat_mutex))
			break;
		usleep(1000);		<----- line 1738
	} while (1);

So it looks like it's not able to grab the stat_mutex. But I can't
figure out how the stat_mutex could be down. None of the stack traces
seem to show that, and I've looked at all of the places where
stat_mutex is taken, and it doesn't look like stat_mutex should ever be
down for more than, say, a second.

So as a temporary workaround, I'm considering adding a check to see if
we stay stuck in this loop for more than a thousand iterations, and if
so, print an error to stderr and then call _exit(1), or maybe just
break out two levels by jumping to line 1778, at "td_set_runstate(td,
TD_FINISHING)", and give up on the usage statistics (since for xfstests
we really don't care about the usage stats). A rough sketch is in the
P.P.S. below.

					- Ted

P.S. I can't see any way this could be happening other than perhaps a
pointer error that corrupted stat_mutex; I can't see how a thread could
legitimately leave stat_mutex down. WDYT?
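
P.P.S. For concreteness, here's an untested sketch of that workaround
against the loop quoted above. The "stuck" counter and the
"deadlock_detected" label are names I just made up; the label would sit
at the td_set_runstate(td, TD_FINISHING) call around line 1778. Since
the loop sleeps 1000us per pass, a thousand iterations is roughly one
second of being unable to grab the mutex:

	int stuck = 0;

	do {
		check_update_rusage(td);
		if (!fio_mutex_down_trylock(stat_mutex))
			break;
		usleep(1000);
		if (++stuck > 1000) {
			/*
			 * stat_mutex has been down for ~1 second; give up
			 * on the usage stats instead of spinning forever.
			 */
			log_err("fio: can't grab stat_mutex, skipping stats\n");
			goto deadlock_detected;
			/* ...or, more drastically: _exit(1); */
		}
	} while (1);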