On 01/15/2014 03:16 AM, Johannes Weiner wrote:
> On Tue, Jan 14, 2014 at 09:01:09AM +0800, Bob Liu wrote:
>> Hi Johannes,
>>
>> On 01/11/2014 02:10 AM, Johannes Weiner wrote:
>>> The VM maintains cached filesystem pages on two types of lists. One
>>> list holds the pages recently faulted into the cache, the other list
>>> holds pages that have been referenced repeatedly on that first list.
>>> The idea is to prefer reclaiming young pages over those that have
>>> been shown to benefit from caching in the past. We call the recently
>>> used list "inactive list" and the frequently used list "active list".
>>>
>>> Currently, the VM aims for a 1:1 ratio between the lists, which is
>>> the "perfect" trade-off between the ability to *protect* frequently
>>> used pages and the ability to *detect* frequently used pages. This
>>> means that working set changes bigger than half of cache memory go
>>> undetected and thrash indefinitely, whereas working sets bigger than
>>> half of cache memory are unprotected against used-once streams that
>>> don't even need caching.
>>>
>>
>> Good job! This patch looks good to me, and the descriptions are nice.
>> But it seems that this patch set only fixes the issue of "working set
>> changes bigger than half of cache memory go undetected and thrash
>> indefinitely". My concern is whether it can be extended easily to
>> address all the other issues on top of this patch set?
>>
>> The other possible way is something like Peter's CART and Clock-Pro
>> implementations, which I think may be better because they use more
>> advanced algorithms and consider the problem as a whole from the
>> beginning. (Sorry, I haven't had enough time to read the source code,
>> so I'm not 100% sure.)
>> http://linux-mm.org/PeterZClockPro2
>
> My patches are moving the VM towards something that is comparable to
> how Peter implemented Clock-Pro. However, the current VM has evolved
> over time in small increments based on real life performance
> observations. Rewriting everything in one go would be incredibly
> disruptive and I doubt very much we would merge any such proposal in
> the first place. So it's not like I don't see the big picture, it's
> just divide and conquer:
>
> Peter's Clock-Pro implementation was basically a double clock with an
> intricate system to classify hotness, augmented by eviction
> information to work with reuse distances independent of memory size.
>
> What we have right now is a double clock with a very rudimentary
> system to classify whether a page is hot: it has been accessed twice
> while on the inactive clock. My patches now add eviction information
> to this, and improve the classification so that it can work with
> reuse distances up to memory size and is no longer dependent on the
> inactive clock size.
>
> This is the smallest imaginable step that is still useful, and even
> then we had a lot of discussions about scalability of the data
> structures and confusion about how the new data point should be
> interpreted. It also took a long time until somebody read the series
> and went, "Ok, this actually makes sense to me." Now, maybe I suck at
> documenting, but maybe this is just complicated stuff. Either way, we
> have to get there collectively, so that the code is maintainable in
> the long term.
>
> Once we have these new concepts established, we can further improve
> the hotness detector so that it can classify and order pages with
> reuse distances beyond memory size. But this will come with its own
> set of problems. For example, some time ago we stopped regularly
> scanning and rotating active pages because of scalability issues, but
> we'll most likely need an up-to-date estimate of the reuse distances
> on the active list in order to classify refaults properly.
>

Thank you for your kind explanation. It makes sense to me; please feel
free to add my review.

>>> + * Approximating inactive page access frequency - Observations:
>>> + *
>>> + * 1. When a page is accessed for the first time, it is added to the
>>> + *    head of the inactive list, slides every existing inactive page
>>> + *    towards the tail by one slot, and pushes the current tail page
>>> + *    out of memory.
>>> + *
>>> + * 2. When a page is accessed for the second time, it is promoted to
>>> + *    the active list, shrinking the inactive list by one slot. This
>>> + *    also slides all inactive pages that were faulted into the cache
>>> + *    more recently than the activated page towards the tail of the
>>> + *    inactive list.
>>> + *
>>
>> Nitpick, how about the reference bit?
>
> What do you mean?
>

Sorry, I meant the PG_referenced flag. I thought that when a page is
accessed for the second time, only the PG_referenced flag is set rather
than the page being promoted to the active list.
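
For concreteness, here is a toy userspace sketch of the "accessed twice
while on the inactive clock" rule as I read your description above. The
struct and function names are made up for illustration; this is not the
real mm/swap.c code. My confusion is whether the second access really
promotes the page like this, or whether it only sets PG_referenced at
that point:

/*
 * Toy model of the two-access rule as I understand it, NOT the real
 * mm/swap.c code: the first access only marks the page referenced,
 * and the second access promotes it from the inactive to the active
 * list.
 */
#include <stdbool.h>
#include <stdio.h>

struct toy_page {
	bool referenced;	/* models PG_referenced */
	bool active;		/* models PG_active, i.e. which list it is on */
};

/* Models what happens on every cache access to the page. */
static void toy_mark_accessed(struct toy_page *page)
{
	if (!page->active && page->referenced) {
		/* Second access while on the inactive list: activate. */
		page->active = true;
		page->referenced = false;
	} else if (!page->referenced) {
		/* First access: only remember the reference. */
		page->referenced = true;
	}
}

int main(void)
{
	struct toy_page page = { .referenced = false, .active = false };

	toy_mark_accessed(&page);	/* first access */
	printf("after 1st access: referenced=%d active=%d\n",
	       page.referenced, page.active);

	toy_mark_accessed(&page);	/* second access */
	printf("after 2nd access: referenced=%d active=%d\n",
	       page.referenced, page.active);

	return 0;
}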

--
Regards,
-Bob