On Fri 06-05-11 11:08:23, Wu Fengguang wrote: > With the more aggressive "keep writeback as long as we wrote something" > logic in wb_writeback(), the "use LONG_MAX .nr_to_write" trick in commit > b9543dac5bbc ("writeback: avoid livelocking WB_SYNC_ALL writeback") is > no longer enough to stop sync livelock. > > The fix is to explicitly update .dirtied_when on synced inodes, so that > they are no longer considered for writeback in the next round. The changelog doesn't make sense now when the patch is in the second place in the series... Also as we discussed in the previous iteration of the patches, I thought you'll move the condition before mapping_tagged() test. I.e. the code would be (comment updated): if (!(inode->i_state & I_FREEING)) { /* * Sync livelock prevention: Each inode is tagged and * synced in one shot. So we can unconditionally update its * dirty time to prevent syncing it again. Note that time * ordering of b_dirty list will be kept because the * following code either removes the inode from b_dirty * or calls redirty_tail(). */ if (wbc->sync_mode == WB_SYNC_ALL || wbc->tagged_sync) inode->dirtied_when = jiffies; if (mapping_tagged(mapping, PAGECACHE_TAG_DIRTY)) { ... Honza > > CC: Jan Kara <jack@xxxxxxx> > Signed-off-by: Wu Fengguang <fengguang.wu@xxxxxxxxx> > --- > fs/fs-writeback.c | 9 +++++++++ > 1 file changed, 9 insertions(+) > > ext3/ext4 are working fine now, however tests show that XFS may still > livelock inside the XFS routines: > > [ 3581.181253] sync D ffff8800b6ca15d8 4560 4403 4392 0x00000000 > [ 3581.181734] ffff88006f775bc8 0000000000000046 ffff8800b6ca12b8 00000001b6ca1938 > [ 3581.182411] ffff88006f774000 00000000001d2e40 00000000001d2e40 ffff8800b6ca1280 > [ 3581.183088] 00000000001d2e40 ffff88006f775fd8 00000340af111ef2 00000000001d2e40 > [ 3581.183765] Call Trace: > [ 3581.184008] [<ffffffff8109be73>] ? lock_release_holdtime+0xa3/0xab > [ 3581.184392] [<ffffffff8108cc0d>] ? prepare_to_wait+0x6c/0x79 > [ 3581.184756] [<ffffffff8108cc0d>] ? prepare_to_wait+0x6c/0x79 > [ 3581.185120] [<ffffffff812ed520>] xfs_ioend_wait+0x87/0x9f > [ 3581.185474] [<ffffffff8108c97a>] ? wake_up_bit+0x2a/0x2a > [ 3581.185827] [<ffffffff812f742a>] xfs_sync_inode_data+0x92/0x9d > [ 3581.186198] [<ffffffff812f76e2>] xfs_inode_ag_walk+0x1a5/0x287 > [ 3581.186569] [<ffffffff812f779b>] ? xfs_inode_ag_walk+0x25e/0x287 > [ 3581.186946] [<ffffffff812f7398>] ? xfs_sync_worker+0x69/0x69 > [ 3581.187311] [<ffffffff812e2354>] ? xfs_perag_get+0x68/0xd0 > [ 3581.187669] [<ffffffff81092175>] ? local_clock+0x41/0x5a > [ 3581.188020] [<ffffffff8109be73>] ? lock_release_holdtime+0xa3/0xab > [ 3581.188403] [<ffffffff812e22ec>] ? xfs_check_sizes+0x160/0x160 > [ 3581.188773] [<ffffffff812e2354>] ? xfs_perag_get+0x68/0xd0 > [ 3581.189130] [<ffffffff812e236c>] ? xfs_perag_get+0x80/0xd0 > [ 3581.189488] [<ffffffff812e22ec>] ? xfs_check_sizes+0x160/0x160 > [ 3581.189858] [<ffffffff812f7831>] ? xfs_inode_ag_iterator+0x6d/0x8f > [ 3581.190241] [<ffffffff812f7398>] ? xfs_sync_worker+0x69/0x69 > [ 3581.190606] [<ffffffff812f780b>] xfs_inode_ag_iterator+0x47/0x8f > [ 3581.190982] [<ffffffff811611f5>] ? __sync_filesystem+0x7a/0x7a > [ 3581.191352] [<ffffffff812f7877>] xfs_sync_data+0x24/0x43 > [ 3581.191703] [<ffffffff812f7911>] xfs_quiesce_data+0x2c/0x88 > [ 3581.192065] [<ffffffff812f5556>] xfs_fs_sync_fs+0x21/0x48 > [ 3581.192419] [<ffffffff811611e1>] __sync_filesystem+0x66/0x7a > [ 3581.192783] [<ffffffff8116120b>] sync_one_sb+0x16/0x18 > [ 3581.193128] [<ffffffff8113e3e3>] iterate_supers+0x72/0xce > [ 3581.193482] [<ffffffff81161140>] sync_filesystems+0x20/0x22 > [ 3581.193842] [<ffffffff8116127e>] sys_sync+0x21/0x33 > [ 3581.194177] [<ffffffff819016c2>] system_call_fastpath+0x16/0x1b > > --- linux-next.orig/fs/fs-writeback.c 2011-05-05 23:30:22.000000000 +0800 > +++ linux-next/fs/fs-writeback.c 2011-05-05 23:30:23.000000000 +0800 > @@ -432,6 +432,15 @@ writeback_single_inode(struct inode *ino > requeue_io(inode); > } else { > /* > + * sync livelock prevention: each inode is > + * tagged and synced in one shot. If still > + * dirty, move it back to s_dirty with updated > + * dirty time to prevent syncing it again. > + */ > + if (wbc->sync_mode == WB_SYNC_ALL || > + wbc->tagged_sync) > + inode->dirtied_when = jiffies; > + /* > * Writeback blocked by something other than > * congestion. Delay the inode for some time to > * avoid spinning on the CPU (100% iowait) > > -- Jan Kara <jack@xxxxxxx> SUSE Labs, CR -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html