On Wed, Oct 07, 2009 at 07:14:54PM +0800, Wu Fengguang wrote:
> On Wed, Oct 07, 2009 at 06:38:37PM +0800, Nikita Danilov wrote:
> > Hello,
> >
> > > When pageout a dirty page, try to piggy back more consecutive dirty
> > > pages (up to 512KB) to improve IO efficiency.
> > >
> > > Only ext3/reiserfs which don't have its own aops->writepages are
> > > supported in this initial version.
> > >
> >
> > [...]
> >
> > >         /*
> > > +        * only write_cache_pages() supports for_reclaim for now
> > > +        */
> > > +       if (!mapping->a_ops->writepages) {
> > > +               wbc.range_start = (page->index + 1) << PAGE_CACHE_SHIFT;
> > > +               wbc.nr_to_write = LUMPY_PAGEOUT_PAGES - 1;
> > > +               generic_writepages(mapping, &wbc);
> > > +       }
> >
> > This might end up calling ->writepage() on a page_mapped() page,
> > which used to be a problem, at least for shmem (see
> > BUG_ON(page_mapped(page)) in shmem_writepage()).
>
> Good catch, thanks a lot!

You also have a problem in that you aren't allowed to touch mapping
after the page is unlocked. So this patch looks buggy. I didn't see
where Nikita mentions the inode reclaim race, but this is it.

It seems tempting to do this lumpy writeout here, but... hmm, I
don't know. I would like to see workloads where it matters now, and
then I would like to know why normal writeout doesn't keep up.
Then... maybe it's enough simply to allow the filesystem to do their
own clustering because they can tell when writepage is called for
reclaim. But if you have real cases where this helps, it would be very
interesting.
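
To make the last suggestion concrete, here is a rough userspace model
of a filesystem doing its own clustering from inside writepage. It is
only a sketch, not kernel code: every toy_* name and TOY_* constant is
made up for illustration, and the only thing borrowed from the kernel
is the idea of a for_reclaim flag in the writeback control. The point
it tries to show is that the extra pages are written while the
original page is still locked, so nothing touches the mapping after
the unlock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_PAGES      16  /* pages in the toy file (assumption) */
#define TOY_CLUSTER_PAGES  4  /* hypothetical cap; the patch uses LUMPY_PAGEOUT_PAGES */

/* Toy stand-ins for struct page, struct address_space, writeback_control. */
struct toy_page {
	bool dirty;
	bool locked;
};

struct toy_mapping {
	struct toy_page pages[TOY_NR_PAGES];
};

struct toy_wbc {
	bool for_reclaim;  /* set when writepage is called from reclaim */
};

/*
 * Toy ->writepage: entered with pages[index] locked, returns with it
 * unlocked. Returns the number of pages written.
 */
static int toy_writepage(struct toy_mapping *m, size_t index,
			 const struct toy_wbc *wbc)
{
	struct toy_page *page = &m->pages[index];
	int written = 0;

	/* In this model the caller's page lock is what pins the mapping. */
	assert(page->locked);

	page->dirty = false;
	written++;

	if (wbc->for_reclaim) {
		/* Cluster the following dirty pages while still holding the lock. */
		for (size_t i = index + 1;
		     i < TOY_NR_PAGES && written < TOY_CLUSTER_PAGES; i++) {
			struct toy_page *next = &m->pages[i];

			if (!next->dirty || next->locked)
				break;  /* only consecutive, uncontended pages */

			next->locked = true;
			next->dirty = false;
			next->locked = false;
			written++;
		}
	}

	/* Past this point the caller must not dereference m any more. */
	page->locked = false;
	return written;
}
```

With pages 2..8 dirty and page 2 locked, a for_reclaim call on index 2
cleans pages 2..5 and stops at the cap. The same call without
for_reclaim writes only the one page.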