On 9/21/21 3:35 PM, Dave Chinner wrote:
> On Tue, Sep 21, 2021 at 08:19:53AM -0600, Jens Axboe wrote:
>> On 9/21/21 7:25 AM, Jens Axboe wrote:
>>> On 9/21/21 12:40 AM, Dave Chinner wrote:
>>>> Hi Jens,
>>>>
>>>> I updated all my trees from 5.14 to 5.15-rc2 this morning and
>>>> immediately had problems running the recoveryloop fstest group on
>>>> them. These tests have a typical pattern of "run load in the
>>>> background, shutdown the filesystem, kill load, unmount and test
>>>> recovery".
>>>>
>>>> When the load includes fsstress, and it gets killed after shutdown,
>>>> it hangs on exit like so:
>>>>
>>>> # echo w > /proc/sysrq-trigger
>>>> [  370.669482] sysrq: Show Blocked State
>>>> [  370.671732] task:fsstress        state:D stack:11088 pid: 9619 ppid:  9615 flags:0x00000000
>>>> [  370.675870] Call Trace:
>>>> [  370.677067]  __schedule+0x310/0x9f0
>>>> [  370.678564]  schedule+0x67/0xe0
>>>> [  370.679545]  schedule_timeout+0x114/0x160
>>>> [  370.682002]  __wait_for_common+0xc0/0x160
>>>> [  370.684274]  wait_for_completion+0x24/0x30
>>>> [  370.685471]  do_coredump+0x202/0x1150
>>>> [  370.690270]  get_signal+0x4c2/0x900
>>>> [  370.691305]  arch_do_signal_or_restart+0x106/0x7a0
>>>> [  370.693888]  exit_to_user_mode_prepare+0xfb/0x1d0
>>>> [  370.695241]  syscall_exit_to_user_mode+0x17/0x40
>>>> [  370.696572]  do_syscall_64+0x42/0x80
>>>> [  370.697620]  entry_SYSCALL_64_after_hwframe+0x44/0xae
>>>>
>>>> It's 100% reproducible on one of my test machines, but only one of
>>>> them. That one machine is running fstests on pmem, so it has
>>>> synchronous storage. Every other test machine is using normal async
>>>> storage (nvme, iscsi, etc) and none of them are hanging.
>>>>
>>>> A quick troll of the commit history between 5.14 and 5.15-rc2
>>>> indicates a couple of potential candidates. The 5th kernel build
>>>> (instead of ~16 for a bisect) told me that commit 15e20db2e0ce
>>>> ("io-wq: only exit on fatal signals") is the cause of the
>>>> regression. I've confirmed that this is the first commit where the
>>>> problem shows up.
>>>
>>> Thanks for the report Dave, I'll take a look. Can you elaborate on
>>> exactly what is being run? And when killed, it's a non-fatal signal?
>
> It's whatever kill/killall sends by default. Typical behaviour that
> causes a hang is something like:
>
> $FSSTRESS_PROG -n10000000 -p $PROCS -d $load_dir >> $seqres.full 2>&1 &
> ....
> sleep 5
> _scratch_shutdown
> $KILLALL_PROG -q $FSSTRESS_PROG
> wait
>
> _scratch_shutdown is typically just an 'xfs_io -rx -c "shutdown"
> /mnt/scratch' command that shuts down the filesystem. Other tests in
> the recoveryloop group use DM targets to fail IO and trigger a
> shutdown, others inject errors that trigger shutdowns, etc. But the
> result is that they all hang waiting for fsstress processes that have
> been using io_uring to exit.
>
> Just run fstests with "./check -g recoveryloop" - there's only a
> handful of tests and it only takes about 5 minutes to run them all
> on a fake DRAM-based pmem device.

I made a trivial reproducer just to verify.

>> Can you try with this patch?
>>
>> diff --git a/fs/io-wq.c b/fs/io-wq.c
>> index b5fd015268d7..1e55a0a2a217 100644
>> --- a/fs/io-wq.c
>> +++ b/fs/io-wq.c
>> @@ -586,7 +586,8 @@ static int io_wqe_worker(void *data)
>>
>>  			if (!get_signal(&ksig))
>>  				continue;
>> -			if (fatal_signal_pending(current))
>> +			if (fatal_signal_pending(current) ||
>> +			    signal_group_exit(current->signal)) {
>>  				break;
>>  			continue;
>>  		}
>
> Cleaned up so it compiles and the tests run properly again. But
> playing whack-a-mole with signals seems kinda fragile. I was pointed
> to this patchset by another dev on #xfs overnight who saw the same
> hangs, and it also fixed the hang:

It seems sane to me - exit if there's a fatal signal, or if we're doing
a core dump. I don't think there should be other conditions.
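For reference, assuming Dave's cleanup was simply dropping the stray '{'
from the hunk above, the exit check in io_wqe_worker() would end up
looking roughly like this (a sketch, not the committed fix):

	if (!get_signal(&ksig))
		continue;
	/* bail out on a fatal signal, or when the thread group is
	 * exiting / dumping core */
	if (fatal_signal_pending(current) ||
	    signal_group_exit(current->signal))
		break;
	continue;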
> https://lore.kernel.org/lkml/cover.1629655338.git.olivier@xxxxxxxxxxxxxx/
>
> It was posted about a month ago and I don't see any response to it
> on the lists...

That's been a long discussion, but it's a different topic really. Yes,
it's signals, but it's not this particular issue. It'll happen to work
around this issue, as it cancels everything post core dumping.

-- 
Jens Axboe
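As a rough illustration of what a "trivial reproducer" for this kind of
hang could look like (a guess at the shape only, not the program Jens
mentions; the scratch file path, the use of IOSQE_ASYNC to force io-wq
workers into existence, and abort() standing in for the shutdown-then-kill
sequence are all assumptions):

	#include <fcntl.h>
	#include <stdlib.h>
	#include <string.h>
	#include <liburing.h>

	int main(void)
	{
		struct io_uring ring;
		char buf[4096];
		int fd, i;

		/* hypothetical scratch file on the filesystem under test */
		fd = open("/mnt/scratch/repro", O_CREAT | O_WRONLY, 0644);
		if (fd < 0 || io_uring_queue_init(32, &ring, 0) < 0)
			exit(1);

		memset(buf, 0xaa, sizeof(buf));
		for (i = 0; i < 32; i++) {
			struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);

			if (!sqe)
				break;
			io_uring_prep_write(sqe, fd, buf, sizeof(buf),
					    (unsigned long long)i * sizeof(buf));
			/* IOSQE_ASYNC punts the request to io-wq workers */
			sqe->flags |= IOSQE_ASYNC;
		}
		io_uring_submit(&ring);

		/*
		 * Dump core while io-wq workers still exist; the reported
		 * hang sits in do_coredump() waiting for those workers to
		 * react to the group exit.
		 */
		abort();
	}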