On Tue, Jan 5, 2010 at 11:16 AM, Wu Fengguang <fengguang.wu@xxxxxxxxx> wrote: > On Tue, Jan 05, 2010 at 09:46:09AM +0800, Minchan Kim wrote: >> On Mon, Jan 4, 2010 at 9:16 PM, Wu Fengguang <fengguang.wu@xxxxxxxxx> wrote: >> > Hi Minchan, >> > >> > On Mon, Jan 04, 2010 at 01:20:49PM +0800, Minchan Kim wrote: >> >> > --- linux.orig/mm/readahead.c 2010-01-04 12:39:29.000000000 +0800 >> >> > +++ linux/mm/readahead.c 2010-01-04 12:39:30.000000000 +0800 >> >> > @@ -501,6 +501,12 @@ void page_cache_sync_readahead(struct ad >> >> > if (!ra->ra_pages) >> >> > return; >> >> > >> >> > + /* be dumb */ >> >> > + if (filp->f_flags & O_RANDOM) { >> >> > + force_page_cache_readahead(mapping, filp, offset, req_size); >> >> > + return; >> >> > + } >> >> > + >> >> >> >> Let me have a dumb question. :) >> >> >> >> How about testing O_RANDOM in front of ra_pages testing? >> >> >> >> My intention is that although we turn off ra, it would be better to read >> >> contiguous block all at once than readpage() callback doing I/O >> >> one page at a time. >> >> >> >> Is it break some semantics or happen some problem in ondemand readahead? >> > >> > Yes it will have some problem with shrink_readahead_size_eio(), which >> > want to disable readahead and use ->readpage() when ra_pages==0. >> > >> > Do you have specific use case in mind? The file systems that set >> > ra_pages=0 seems to don't need readahead, too. >> >> Never mind. It's just out of curiosity. :) >> >> I thought although user disable readahead, we could enhance file I/O >> with one readpages not multiple readpage if we know the user want to >> read big contiguous blocks. > > Yes, not-break-large-read-into-pages would be good for HD/SSD drives > when readahead is disabled. > > Currently, ->ra_pages is somehow overloaded in its ==0 case. As you > said, it's in fact possible to disable readahead while still limiting > read IO size to a non-zero ->ra_pages. > >> But I though it break current readahead off semantics. right? > > It can be done by applying the ->ra_pages limit to O_RANDOM. This also > makes O_RANDOM safer to use: > > @@ -497,6 +497,13 @@ void page_cache_sync_readahead(struct ad > struct file_ra_state *ra, struct file *filp, > pgoff_t offset, unsigned long req_size) > { > + /* be dumb */ > + if (filp->f_flags & O_RANDOM) { > + req_size = clamp_t(unsigned long, req_size, 1, ra->ra_pages); > + force_page_cache_readahead(mapping, filp, offset, req_size); > + return; > + } > + > /* no read-ahead */ > if (!ra->ra_pages) > return; > > To make real change, we need an interface for the user to disable > whole-partition readahead by setting O_RANDOM instead of ra_pages=0. > That would be a hard sell.. Okay. I understand. Thanks, Wu. -- Kind regards, Minchan Kim -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html