Re: [PATCH] mm: implement POSIX_FADV_NOREUSE

On 03/12/2014 04:59 AM, Lukas Senger wrote:
>> This also looks to ignore the reuse flag for existing pages.  Have you
>> thought about what the semantics should be there?
> 
> The idea is to only treat the pages special when they are first read
> from disk. This way we achieve the main goal of not displacing useful
> cache content.
> 
>> Also, *should* readahead pages really have this flag set?  If a very
>> important page gets brought in via readahead, doesn't this put it at a
>> disadvantage for getting aged out?
> 
> If the flag is not set on readahead pages, the advice barely has any
> effect at all, since most of the file gets read through readahead. Of
> course that very important page has a disadvantage at the beginning, but
> as soon as it has been moved into the active list the NOREUSE doesn't
> affect it anymore. Worst case it gets read once more without the flag.

That's a good point, and it's a much more important change to the
existing code than the fadvise bits are.  Probably best to make a bigger
deal about it in the patch description.

> On Tue, 2014-03-11 at 14:27 -0700, Andrew Morton wrote:
>> And it sets PG_noreuse on new pages whether or not they were within the
>> fadvise range (offset...offset+len).  It's not really an fadvise
>> operation at all.
> 
> NORMAL, SEQUENTIAL and RANDOM don't honor the range either. So we
> figured it would be ok to do so for the sake of keeping the
> implementation simple.
> 
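For reference, this is roughly what the range argument under discussion looks like from userspace. It is a minimal sketch: advise_noreuse() is a hypothetical wrapper name, and only posix_fadvise() and POSIX_FADV_NOREUSE come from the standard interface. Whether the kernel honors the range for this advice is exactly the open question in this thread.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>

/*
 * Hypothetical helper: tell the kernel that the byte range
 * [offset, offset + len) of fd is expected to be read only once.
 * A len of 0 means "from offset to the end of the file".
 *
 * posix_fadvise() returns 0 on success or an error number directly;
 * it does not set errno.
 */
int advise_noreuse(int fd, off_t offset, off_t len)
{
	assert(offset >= 0 && len >= 0);
	return posix_fadvise(fd, offset, len, POSIX_FADV_NOREUSE);
}
```

A caller streaming through a large file once would typically issue this right after open() and before the first read().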
>>> page flags are really scarce and I am not sure this is the best usage
>>> of the few remaining slots.
>>
>> Yeah, especially since the use is so transient.  I can see why using a
>> flag is nice for a quick prototype, but this is a far cry from needing
>> one. :)  You might be able to reuse a bit like PageReadahead.  You could
>> probably also use a bit in the page pointer of the lruvec, or even have
>> a percpu variable that stores a pointer to the 'struct page' you want to
>> mark as NOREUSE.
> 
> Ok, we understand that we can't add a page flag. We tried to find a flag
> to recycle but did not succeed. lruvec doesn't have page pointers and we
> don't have access to a pagevec and the file struct at the same time. We
> don't really understand the last suggestion, as we need to save this
> information for more than one page and going over a list every time we
> add something to an lru list doesn't seem like a good idea.

Yeah, you're right.  I was ignoring the readahead code here.

But, why wouldn't this work there?  Define a percpu variable, and assign
it to the target page in readahead's read_pages() and in
do_generic_file_read() which deal with pages one at a time and not in lists.

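/* Page most recently hinted as "read once"; to be made per-cpu later. */
struct page *read_me_once;

/* Record the page so the LRU-add path can recognize it. */
void hint_page_read_once(struct page *page)
{
	read_me_once = page;
}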

Then check for (read_me_once == page) in add_page_to_lru_list() instead
of the page flag.  Then, make read_me_once per-cpu.  This won't be
preempt safe, but we're talking about readahead and hints here, so we
can probably just bail in the cases where we race.
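A userspace model of that idea, as a sketch only: a thread-local slot stands in for the per-cpu variable, and page_hinted_read_once() is a hypothetical name for the check that would sit in the LRU-add path. Clearing the slot on a match is my assumption, not something the proposal above specifies. A missed match (for example after a race) simply means the page is treated like any other.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel's struct page; only pointer identity matters. */
struct page {
	int id;
};

/*
 * One hint slot per thread models one slot per CPU.
 * In the kernel this would be a per-cpu variable instead.
 */
static _Thread_local struct page *read_me_once;

/* Called by the read path just before the page is added to the LRU. */
void hint_page_read_once(struct page *page)
{
	assert(page != NULL);
	read_me_once = page;
}

/*
 * Called where the page is added to the LRU. Returns true only if this
 * exact page was hinted on this thread, and clears the slot so a stale
 * pointer cannot match a later, unrelated page (an assumption of this
 * sketch).
 */
bool page_hinted_read_once(struct page *page)
{
	bool hit = (page != NULL && read_me_once == page);

	if (hit)
		read_me_once = NULL;
	return hit;
}
```

Because only a single pointer is kept per slot, a second hint before the first page reaches the LRU overwrites the first one; the first page then just loses its hint.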

> Would it be acceptable to add a member to struct page for our purpose?

'struct page' must be aligned to two pointers due to constraints from
the slub allocator.  Adding a single byte to it would bloat it by 16
bytes for me, which translates into 2GB of lost space on my 1TB system.
There are 6TB systems out there today which would lose 12GB.
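The scaling is easy to check with a small helper. This is a sketch under assumptions: struct_page_overhead() is a hypothetical name, and the 4096-byte page size used in the examples is assumed, since the message does not state one. With those inputs, 8 extra bytes per page reproduces the 2GB-per-TB figure, while 16 extra bytes comes to 4 GiB per TiB, so the quoted totals may rest on a different page size or per-page cost.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Extra memory consumed when every struct page grows by extra_per_page
 * bytes: one struct page exists per physical page, so the cost is
 * (number of pages) * (growth per struct page).
 */
uint64_t struct_page_overhead(uint64_t ram_bytes, uint64_t page_size,
			      uint64_t extra_per_page)
{
	assert(page_size > 0);
	return (ram_bytes / page_size) * extra_per_page;
}
```

For example, struct_page_overhead(1ULL << 40, 4096, 16) covers a 1 TiB machine with 4 KiB pages and 16 bytes of growth per struct page.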
