On Wed, 11 Dec 2013 23:49:17 +0100 Jan Kara <jack@xxxxxxx> wrote:

> >  /*
> > - * Given a desired number of PAGE_CACHE_SIZE readahead pages, return a
> > - * sensible upper limit.
> > + * max_sane_readahead() is disabled.  It can later be removed altogether, but
> > + * let's keep a skeleton in place for now, in case disabling was the wrong call.
> >   */
> >  unsigned long max_sane_readahead(unsigned long nr)
> >  {
> > -	return min(nr, (node_page_state(numa_node_id(), NR_INACTIVE_FILE)
> > -			+ node_page_state(numa_node_id(), NR_FREE_PAGES)) / 2);
> > +	return nr;
> >  }
> >
> >  /*
> >
> > Can anyone see a problem with this?
>
>   Well, the downside seems to be that if userspace previously issued
> MADV/FADV_WILLNEED on a huge file, we trimmed the request to a sensible
> size.  Now we try to read the whole huge file, which is pretty much
> guaranteed to be useless (as we'll be pushing out of cache data we just
> read a while ago).  And guessing the right readahead size from userspace
> isn't trivial, so it would make WILLNEED advice less useful.  What do you
> think?

OK, yes, there is conceivably a back-compatibility issue there.  There
indeed might be applications which decide to chuck the whole thing at
the kernel and let the kernel work out what a sensible readahead size
would be.

But I'm really struggling to think up an implementation!  The current
code looks only at the caller's node and doesn't seem to make much
sense.  Should we look at all nodes?  Hard to say without prior
knowledge of where those pages will be coming from.
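
Just to make the "look at all nodes" idea concrete, a rough sketch (not a
tested patch; it assumes clamping against the global inactive-file + free
page counts via global_page_state() is the behaviour we'd actually want):

	/*
	 * Illustrative sketch only: clamp the request against the global
	 * inactive-file + free page counts instead of the caller's node.
	 */
	unsigned long max_sane_readahead(unsigned long nr)
	{
		return min(nr, (global_page_state(NR_INACTIVE_FILE)
				+ global_page_state(NR_FREE_PAGES)) / 2);
	}

Whether halving the global total is any more meaningful than halving the
caller's node total is exactly the open question, of course.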