On Fri, Jan 08, 2016 at 11:44:21AM +0100, Jiri Kosina wrote:
> On Tue, 5 Jan 2016, Dave Chinner wrote:
>
> > > kernel: Freezing of tasks failed after 20.006 seconds (2 tasks refusing to freeze, wq_bu_busy=0):
> > > kernel: xfsaild/dm-1 S 0000000000014100 0 283 2 0x00000000
> > > kernel: ffff880213f53e10 ffffffff8180e4c0 ffff880213f05040 0000000000000000
> > > kernel: 0000000000000000 ffff880213f54000 0000000000000000 0000000000000000
> > > kernel: ffff8800ca389e40 ffff8800ca392000 ffff880213f53ed0 ffffffff814cd8ac
> > > kernel: Call Trace:
> > > kernel: [<ffffffff814cd8ac>] ? schedule+0x2c/0x70
> > > kernel: [<ffffffff8125e37d>] ? xfsaild+0x4fd/0x5b0
> > > kernel: [<ffffffff8125de80>] ? xfs_trans_ail_cursor_first+0x80/0x80
> > > kernel: [<ffffffff8125de80>] ? xfs_trans_ail_cursor_first+0x80/0x80
> > > kernel: [<ffffffff8108bd18>] ? kthread+0xb8/0xd0
> > > kernel: [<ffffffff8108bc60>] ? kthread_worker_fn+0x150/0x150
> > > kernel: [<ffffffff814d115f>] ? ret_from_fork+0x3f/0x70
> > > kernel: [<ffffffff8108bc60>] ? kthread_worker_fn+0x150/0x150
> > > kernel: xfsaild/sda1 S 0000000000014100 0 591 2 0x00000000
> > > kernel: ffff88021193be10 ffff8802159c4dc0 ffff880213ab9340 0000000000000000
> > > kernel: 0000000000000000 ffff88021193c000 0000000000000000 0000000000000000
> > > kernel: ffff8800ca2a4240 ffff880214eea000 ffff88021193bed0 ffffffff814cd8ac
> > > kernel: Call Trace:
> > > kernel: [<ffffffff814cd8ac>] ? schedule+0x2c/0x70
> > > kernel: [<ffffffff8125e37d>] ? xfsaild+0x4fd/0x5b0
> > > kernel: [<ffffffff8125de80>] ? xfs_trans_ail_cursor_first+0x80/0x80
> > > kernel: [<ffffffff8125de80>] ? xfs_trans_ail_cursor_first+0x80/0x80
> > > kernel: [<ffffffff8108bd18>] ? kthread+0xb8/0xd0
> > > kernel: [<ffffffff8108bc60>] ? kthread_worker_fn+0x150/0x150
> > > kernel: [<ffffffff814d115f>] ? ret_from_fork+0x3f/0x70
> > > kernel: [<ffffffff8108bc60>] ? kthread_worker_fn+0x150/0x150
> > >
> > > Please tell me what more information you need to be able to fix this
> > > issue.
> >
> > The freezer detection is broken. The thread is sleeping in schedule
> > until a wakeup occurs some time in the future, which means it cannot
> > "enter the freezer" because it's not a running thread. This is a
> > problem introduced by commit 24ba16b ("xfs: clear PF_NOFREEZE for
> > xfsaild kthread").
> >
> > Jiri, I'm tempted just to revert this change - if the freezer
> > doesn't detect processes that are not in TASK_RUNNABLE state as
> > frozen or can't mark them as frozen, then this change will never
> > work reliably for XFS....
>
> Well, clearly the thread is sleeping in schedule() during the freezing
> operation and it's supposed to be doing so; therefore it doesn't need
> an explicit freezing point, right?

No. It's sleeping in schedule because it's got nothing more to do -
it's issued all its IO and is idle. It is not going to run again
until filesystem modification activity is restarted.

But if the AIL still has objects in it (like it will after a sync),
then it will continue to run and issue IO until it returns to the
empty, idle state. In this active state, we need to freeze the
thread on suspend so that it doesn't keep issuing IO all through the
suspend process...

> So the proper fix would rather be something like
>
>
> From: Jiri Kosina <jkosina@xxxxxxx>
> Subject: [PATCH] xfs: xfsaild doesn't need to be freezable

No, that just means we guarantee that there will be suspend image
coherency problems when suspend is run on a busy filesystem...

This process is not going to enter a runnable state, so is never
going to enter the freezer. But we can't be certain of that, because
we haven't frozen the filesystem and hence it can still be modified
and this thread could be woken and do stuff when it shouldn't.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx