Re: [PATCH 06/18] writeback: sync expired inodes first in background writeback

Jan Kara <jack@xxxxxxx> · Fri, 27 May 2011 01:10:45 +0200



On Wed 25-05-11 22:38:57, Wu Fengguang wrote:
> > and I was wondering: Assume there is one continuously redirtied file and
> > untar starts in parallel. With the new logic, background writeback will
> > never consider inodes that are not expired in this situation (we never
> > switch to "all dirty inodes" phase - or even if we switched, we would just
> > queue all inodes and then return back to queueing only expired inodes). So
> > the net effect is that for 30 seconds we will be only continuously writing
> > pages of the continuously dirtied file instead of (possibly older) pages of
> > other files that are written. Is this really desirable? Wasn't the old
> > behavior simpler and not worse than the new one?
> 
> Good question! Yes sadly in this case the new behavior could be worse
> than the old one.
> 
> In fact this patch does not improve the small files (< 4MB) case at all,
> except for the side effect that fewer unexpired inodes will be left in
> s_io when the background work quits, so the later kupdate work will
> write fewer unexpired inodes.
> 
> And for the mixed small/large files case, it actually results in worse
> behavior on your mentioned case.
> 
> However the root cause here is the file being _actively_ written to,
> which is a kind of livelock. We could add a simple livelock prevention
> scheme that works for the common case of file appending:
> 
> - save i_size when the range_cyclic writeback starts from 0, for
>   limiting the writeback scope
  Hmm, but for this we'd have to store an additional 'unsigned long' (page
index) for each inode. Not sure if it's really worth it.

> - when range_cyclic writeback hits the saved i_size, quit the current
>   inode instead of immediately restarting from 0. This will not only
>   avoid a possible extra seek, but also redirty_tail() the inode and
>   hence get out of possible livelock.
  But I like the idea of doing redirty_tail() when we write out some inode
for too long. Maybe we could just do redirty_tail() instead of requeue_io()
whenever write_cache_pages() had to wrap the index? We could communicate
this by setting a flag in wbc in write_cache_pages()...
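
  A rough userspace sketch of what I mean, just to illustrate the control
flow. This is a toy model and not kernel code: struct toy_wbc, its 'wrapped'
flag, toy_write_cache_pages() and toy_requeue_decision() are all made-up
names, and the real write_cache_pages() and queueing code are more involved.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_PAGES 64

/* Toy stand-ins for illustration only; not the kernel structures. */
struct toy_wbc {
	long nr_to_write;	/* page budget for this call */
	bool wrapped;		/* hypothetical flag: scan wrapped back to 0 */
};

struct toy_inode {
	unsigned long nr_pages;		/* file size in pages */
	unsigned long writeback_index;	/* where the cyclic scan resumes */
	bool dirty[TOY_MAX_PAGES];
};

enum toy_queue { TOY_DONE, TOY_REQUEUE_IO, TOY_REDIRTY_TAIL };

static bool toy_has_dirty(const struct toy_inode *inode)
{
	for (unsigned long i = 0; i < inode->nr_pages; i++)
		if (inode->dirty[i])
			return true;
	return false;
}

/* Cyclic scan: start at writeback_index, wrap to 0 once if needed. */
static long toy_write_cache_pages(struct toy_inode *inode, struct toy_wbc *wbc)
{
	unsigned long start = inode->writeback_index;
	unsigned long index = start;
	unsigned long end = inode->nr_pages;	/* exclusive */
	bool cycled = (start == 0);
	long written = 0;

	assert(inode->nr_pages <= TOY_MAX_PAGES);
	wbc->wrapped = false;
retry:
	for (; index < end; index++) {
		if (!inode->dirty[index])
			continue;
		if (wbc->nr_to_write <= 0) {
			/* Budget used up: remember where to resume. */
			inode->writeback_index = index;
			return written;
		}
		inode->dirty[index] = false;	/* "write" the page */
		wbc->nr_to_write--;
		written++;
	}
	if (!cycled) {
		/* Had to wrap the index: tell the caller about it. */
		cycled = true;
		wbc->wrapped = true;
		index = 0;
		end = start;
		goto retry;
	}
	inode->writeback_index = (index >= inode->nr_pages) ? 0 : index;
	return written;
}

/* Caller side: a wrapped scan sends the inode to the tail of the dirty list. */
static enum toy_queue toy_requeue_decision(const struct toy_inode *inode,
					   const struct toy_wbc *wbc)
{
	if (!toy_has_dirty(inode))
		return TOY_DONE;
	return wbc->wrapped ? TOY_REDIRTY_TAIL : TOY_REQUEUE_IO;
}
```

  The point is only that the wrap is detected where it happens and the
queueing decision stays with the caller, so no extra per-inode state is
needed.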

> The livelock prevention scheme may not only eliminate the undesirable
> behavior you observed for this patch, but also prevent the "some old
> pages may not get the chance to get written to disk in an actively
> dirtied file" data security issue discussed in an old email. What do
> you think?
  So my scheme would not solve this but it does not require per-inode
overhead...
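
  For comparison, the saved-i_size scheme quoted above could be modelled
roughly like this. Again a userspace toy under my own assumptions, not kernel
code: struct sketch_inode, its 'saved_limit' field (the extra per-inode
'unsigned long') and sketch_writeback_pass() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_MAX_PAGES 64

/* Toy inode; saved_limit is the hypothetical extra per-inode field. */
struct sketch_inode {
	unsigned long nr_pages;		/* current size in pages, may grow */
	unsigned long writeback_index;	/* where the scan resumes */
	unsigned long saved_limit;	/* size snapshot taken at index 0 */
	bool dirty[SKETCH_MAX_PAGES];
};

/*
 * One writeback pass over the inode. Returns true when the scan reached
 * the saved size, i.e. the caller should stop working on this inode and
 * move it to the tail of the dirty list instead of restarting from 0.
 */
static bool sketch_writeback_pass(struct sketch_inode *inode, long budget,
				  long *written)
{
	unsigned long index;

	assert(inode->nr_pages <= SKETCH_MAX_PAGES);

	/* Snapshot the size only when a new cycle starts from page 0. */
	if (inode->writeback_index == 0)
		inode->saved_limit = inode->nr_pages;

	*written = 0;
	for (index = inode->writeback_index; index < inode->saved_limit; index++) {
		if (!inode->dirty[index])
			continue;
		if (budget <= 0) {
			/* Out of budget before the limit: resume here later. */
			inode->writeback_index = index;
			return false;
		}
		inode->dirty[index] = false;	/* "write" the page */
		budget--;
		(*written)++;
	}

	/* Hit the saved size: pages appended meanwhile wait for the next cycle. */
	inode->writeback_index = 0;
	return true;
}
```

  In this model, pages appended after the snapshot are not chased in the
current cycle, which is what bounds the time spent on one inode.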

								Honza
-- 
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR