Hi Johannes,

Yes, the problem was as you predicted. I tried to make data-2's pages "active" by manually reading them twice, and data-1's pages were finally evicted from the page cache.

We have large files in PostgreSQL and Hadoop that we scan sequentially, and we try to fit our working set into total memory. So I hope your patches make it into a Linux kernel release soon.

Thanks,
Metin

----- Original Message -----
From: Johannes Weiner <hannes@xxxxxxxxxxx>
To: Jaegeuk Hanse <jaegeuk.hanse@xxxxxxxxx>
Cc: Jan Kara <jack@xxxxxxx>; metin d <metdos@xxxxxxxxx>; "linux-kernel@xxxxxxxxxxxxxxx" <linux-kernel@xxxxxxxxxxxxxxx>; linux-mm@xxxxxxxxx
Sent: Thursday, November 22, 2012 3:09 AM
Subject: Re: Problem in Page Cache Replacement

On Thu, Nov 22, 2012 at 08:48:07AM +0800, Jaegeuk Hanse wrote:
> On 11/22/2012 05:34 AM, Johannes Weiner wrote:
> >Hi,
> >
> >On Tue, Nov 20, 2012 at 07:25:00PM +0100, Jan Kara wrote:
> >>On Tue 20-11-12 09:42:42, metin d wrote:
> >>>I have two PostgreSQL databases named data-1 and data-2 that sit on the
> >>>same machine. Both databases keep 40 GB of data, and the total memory
> >>>available on the machine is 68GB.
> >>>
> >>>I started data-1 and data-2, and ran several queries to go over all their
> >>>data. Then, I shut down data-1 and kept issuing queries against data-2.
> >>>For some reason, the OS still holds on to large parts of data-1's pages
> >>>in its page cache, and reserves about 35 GB of RAM to data-2's files. As
> >>>a result, my queries on data-2 keep hitting disk.
> >>>
> >>>I'm checking page cache usage with fincore. When I run a table scan query
> >>>against data-2, I see that data-2's pages get evicted and put back into
> >>>the cache in a round-robin manner. Nothing happens to data-1's pages,
> >>>although they haven't been touched for days.
> >>>
> >>>Does anybody know why data-1's pages aren't evicted from the page cache?
> >>>I'm open to all kind of suggestions you think it might relate to problem.
> >This might be because we do not deactivate pages as long as there is
> >cache on the inactive list. I'm guessing that the inter-reference
> >distance of data-2 is bigger than half of memory, so it's never
> >getting activated and data-1 is never challenged.
>
> Hi Johannes,
>
> What's the meaning of "inter-reference distance"

It's the number of memory accesses between two accesses to the same
page:

    A B C D A B C E ...
      |_______|
      |       |

> and why compare it with half of memory, what's the trick?

If B gets accessed twice, it gets activated. If it gets evicted in
between, the second access will be a fresh page fault and B will not
be recognized as frequently used.

Our cutoff for scanning the active list is cache size / 2 right now
(inactive_file_is_low), leaving 50% of memory to the inactive list.
If the inter-reference distance for pages on the inactive list is
bigger than that, they get evicted before their second access.
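To make the effect above concrete, here is a small toy simulation of the
two-list scheme. It is only a sketch under stated assumptions, not the
kernel's reclaim code: one "page" stands for 1 GB, TwoListCache, scan and
demo are made-up names, any hit on an inactive page activates it, and
active pages are deactivated only while the active list is larger than the
inactive list. Reading each data-2 page twice back to back is likewise an
assumption about what the workaround did.

```python
from collections import OrderedDict


class TwoListCache:
    """Toy active/inactive page cache (illustrative, not kernel code)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.inactive = OrderedDict()  # oldest entry first
        self.active = OrderedDict()    # oldest entry first

    def _reclaim_one(self):
        # Assumption: deactivate only while the active list is the larger
        # one, mirroring the "cache size / 2" cutoff described above.
        while len(self.active) > len(self.inactive):
            page, _ = self.active.popitem(last=False)
            self.inactive[page] = None
        # Evict the oldest inactive page.
        self.inactive.popitem(last=False)

    def access(self, page):
        """Touch a page; return True if it was already resident."""
        if page in self.active:
            self.active.move_to_end(page)
            return True
        if page in self.inactive:
            # Second access while still resident: activate the page.
            del self.inactive[page]
            self.active[page] = None
            return True
        # Page fault: make room if the cache is full, then insert as inactive.
        if len(self.active) + len(self.inactive) >= self.capacity:
            self._reclaim_one()
        self.inactive[page] = None
        return False

    def resident(self, name):
        """Count resident pages that belong to the database `name`."""
        pages = list(self.active) + list(self.inactive)
        return sum(1 for owner, _ in pages if owner == name)


def scan(cache, pages, touches=1):
    """Touch each page `touches` times in a row; count first-touch hits."""
    hits = 0
    for page in pages:
        hits += cache.access(page)
        for _ in range(touches - 1):
            cache.access(page)
    return hits


def demo():
    data1 = [("data-1", i) for i in range(40)]
    data2 = [("data-2", i) for i in range(40)]
    cache = TwoListCache(capacity=68)

    scan(cache, data1, touches=2)                   # data-1 becomes active
    plain = [scan(cache, data2) for _ in range(5)]  # repeated table scans
    stuck = cache.resident("data-1")                # data-1 still cached

    scan(cache, data2, touches=2)                   # read data-2 twice
    after = scan(cache, data2)                      # plain scan again
    return plain, stuck, after, cache.resident("data-1")


if __name__ == "__main__":
    print(demo())
```

In this model the plain scans of data-2 never hit, because only 34 pages
are left to the inactive list while the scan cycles over 40, so every page
is evicted before its second access and data-1 keeps 34 pages on the active
list. After one pass that touches every data-2 page twice, a plain scan
hits all 40 pages and data-1 shrinks to the 28 pages that data-2 does not
need.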