On Monday, March 27, 2017 01:46:07 PM Darrick J. Wong wrote:
> [cc linux-pm since this intersects with suspend...]
>
> On Sat, Feb 04, 2017 at 09:31:27AM +1100, Dave Chinner wrote:
> > On Thu, Feb 02, 2017 at 05:04:01PM -0800, Darrick J. Wong wrote:
> > > Hi list,
> > >
> > > So I've noticed that my laptop consistently fails to suspend with:
> > >
> > > [1183625.726800] atkbd serio0: Unknown key pressed (translated set 2, code 0xd8 on isa0060/serio0).
> > > [1183625.726804] atkbd serio0: Use 'setkeycodes e058 <keycode>' to make it known.
> > > [1183625.727492] atkbd serio0: Unknown key released (translated set 2, code 0xd8 on isa0060/serio0).
> > > [1183625.727497] atkbd serio0: Use 'setkeycodes e058 <keycode>' to make it known.
> > > [1183626.203928] e1000e: enp0s25 NIC Link is Down
> > > [1183626.422720] PM: Syncing filesystems ... done.
> > > [1183626.450348] Freezing user space processes ... (elapsed 0.002 seconds) done.
> > > [1183626.452995] Freezing remaining freezable tasks ...
> > > [1183632.657243] atkbd serio0: Unknown key pressed (translated set 2, code 0xd9 on isa0060/serio0).
> > > [1183632.657247] atkbd serio0: Use 'setkeycodes e059 <keycode>' to make it known.
> > > [1183632.657814] atkbd serio0: Unknown key released (translated set 2, code 0xd9 on isa0060/serio0).
> > > [1183632.657817] atkbd serio0: Use 'setkeycodes e059 <keycode>' to make it known.
> > > [1183646.459310] Freezing of tasks failed after 20.006 seconds (1 tasks refusing to freeze, wq_busy=0):
> > > [1183646.459348] xfsaild/dm-1 D 0 1767 2 0x00000000
> >
> > Yes, this can happen because suspend thinks that "sync" is
> > sufficient to quiesce a filesystem into an idle state.
> >
> > > [1183646.459366] Call Trace:
> > > [1183646.459386] [<ffffffffb5a43b8d>] schedule+0x3d/0x90
> > > [1183646.459390] [<ffffffffb5a47339>] schedule_timeout+0x239/0x420
> > > [1183646.459401] [<ffffffffb5a450e6>] wait_for_completion+0xa6/0x120
> > > [1183646.459460] [<ffffffffb539ba0f>] xfs_buf_submit_wait+0x7f/0x280
> > > [1183646.459466] [<ffffffffb539bc33>] _xfs_buf_read+0x23/0x30
> > > [1183646.459470] [<ffffffffb539bd64>] xfs_buf_read_map+0x124/0x1b0
> > > [1183646.459473] [<ffffffffb53eb270>] xfs_trans_read_buf_map+0x110/0x370
> > > [1183646.459478] [<ffffffffb538417e>] xfs_imap_to_bp+0x6e/0xe0
> > > [1183646.459481] [<ffffffffb53b3883>] xfs_iflush+0xd3/0x230
> > > [1183646.459486] [<ffffffffb53e0ab4>] xfs_inode_item_push+0xf4/0x150
> > > [1183646.459489] [<ffffffffb53e9cdf>] xfsaild+0x2df/0x740
> > > [1183646.459500] [<ffffffffb51101f9>] kthread+0xd9/0xf0
> >
> > That's inode writeback when the underlying inode buffer has been
> > reclaimed before the dirty cached inode has been written. So the
> > xfsaild is doing read/modify/write cycles to write back dirty
> > inodes. i.e. you're running in active memory reclaim conditions
> > prior to suspend...
>
> So I wrote up a patch that removes WQ_FREEZABLE from the xfs_buf thread,
> and since then I haven't had any problems suspending my laptop. Last
> week at LSF I inquired about whether it was proper to be freezing IO
> helper threads as part of suspend, and was told in response "Are you
> convinced that use of WQ_FREEZABLE is even correct?" TBH I can't see
> why you'd want to freeze IO helper workqueues at all.
>
> So, I'm going to email that patch out as an RFC and if anyone wants to
> follow up the discussion, let's do it there.

Yes, please!

> I get it, suspend really
> should just fsfreeze, but the question I really want to know is, why
> does XFS freeze its own threads? They seem to go to sleep just fine
> after we're done doing all the IO we want.

That, quite frankly, is what I would expect.

> > > ISTR Dave or someone grumbling about this being some artifact of the log
> > > trying to read in some buffer or other as part of flushing the log prior
> > > to suspend, but the io completion ends up tied to a workqueue that's
> > > already been put to sleep, so xfs gets stuck forever.
> >
> > Yup, suspend is just completely fucked, has been for more than 10
> > years. It needs to freeze filesystems so they are quiesced sanely,
> > not left to run while random parts of the kernel infrastructure they
> > rely on are shut down behind the filesystem's back.
> >
> > > Look familiar to anyone before I try to debug this tomorrow?
> >
> > See this as a recent starting point.
> >
> > https://lwn.net/Articles/705269/
>
> I wonder if they've done any work on freezing filesystems...

Not that I know of.

Thanks,
Rafael