On Wed, Apr 01, 2009 at 08:48:43PM +0800, Jeff Layton wrote: > > > > --- mm.orig/fs/fs-writeback.c > > > > +++ mm/fs/fs-writeback.c > > > > @@ -196,7 +196,7 @@ static void redirty_tail(struct inode *i > > > > struct inode *tail_inode; > > > > > > > > tail_inode = list_entry(sb->s_dirty.next, struct inode, i_list); > > > > - if (!time_after_eq(inode->dirtied_when, > > > > + if (time_before(inode->dirtied_when, > > > > tail_inode->dirtied_when)) > > > > inode->dirtied_when = jiffies; > > > > } > > > > > > I think we need a similar change in this function in order to maintain > > > the list order. > > > > > > Consider this case: > > > > > > We have an s_dirty list with a head inode that appears to be in the > > > future. We start writeback and clear out s_dirty (all of the inodes are > > > moved to s_io). A new inode is dirtied, and goes onto the empty s_dirty > > > list with a dirtied_when value that equals now. The inode with the > > > dirtied_when value that looks like it's in the future is redirtied while > > > being written and redirty_tail is called. It goes back on the list > > > without resetting dirtied_when even though it's actually older than the > > > inode at the tail. > > > > What's the difference? It _is_ the past because all 2 reference sites > > are now taught to think so. > > > > So s_dirty is still in order, and the writeback process won't be blocked. > > > > Sanity check -- my understanding is this: > > head == least-recently dirtied inode > tail == most-recently dirtied inode > > ...if so, then we are violating the list order if we don't make a > change to redirty_tail. We're putting an inode that's far in the past > back onto the tail of the list without resetting dirtied_when. A more > recently-dirtied inode will precede one that was dirtied less recently. > > Since the newly dirtied inode is closer to the head of the list, the > older inode that's constantly being redirtied won't be written out > until the newly dirtied one passes the older_than_this check (30s or > so in the usual case). If you call that out-of-order, yes it is. Sadly it cannot be improved by playing with dirtied_when: the _physical_ order is still the same. You know what? That's exactly the drawback of redirtying into s_dirty. It's irrelevant to the resetting of dirtied_when. A new s_more_io_wait is the only way to solve this problem. > > > There is another option too that I'll throw out here... > > > > > > We could just make dirtied_when a 64 bit value on 32 bit machines and > > > use jiffies_64 there. On the upside there is no "problematic > > > window" with that. The downside is that struct inode would grow by 4 > > > bytes on 32 bit arches, and checking jiffies_64 on such an arch is > > > more computationally intensive. We'd also have to change the size of > > > older_than_this value in the writeback_control struct too if we want to > > > go this route... > > > > Yes that could eliminate the 30s or more temporary writeback stillness. > > The only problem is the extra costs for normal cases, especially the > > space cost. > > > > Correct. I'm not necessarily advocating that approach but it's one to > consider... > > If your s_more_io_wait patchset comes to fruition though then that > change really won't be needed, so maybe it's best not to go that route. Sorry for the delay. But I'm still curious about the redirty process&timing of NFS/XFS. The conflicts with Jens' per-bdi pdflush patches are another concern... Thanks, Fengguang -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html