On Fri 22-03-13 11:04:49, Michal Hocko wrote:
> On Fri 22-03-13 08:37:04, Mel Gorman wrote:
> > On Fri, Mar 22, 2013 at 08:54:27AM +0100, Michal Hocko wrote:
> > > On Thu 21-03-13 15:34:42, Mel Gorman wrote:
> > > > On Thu, Mar 21, 2013 at 04:07:55PM +0100, Michal Hocko wrote:
> > > > > > > > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > > > > > > > index 4835a7a..182ff15 100644
> > > > > > > > --- a/mm/vmscan.c
> > > > > > > > +++ b/mm/vmscan.c
> > > > > > > > @@ -1815,6 +1815,45 @@ out:
> > > > > > > >  	}
> > > > > > > >  }
> > > > > > > > 
> > > > > > > > +static void recalculate_scan_count(unsigned long nr_reclaimed,
> > > > > > > > +				unsigned long nr_to_reclaim,
> > > > > > > > +				unsigned long nr[NR_LRU_LISTS])
> > > > > > > > +{
> > > > > > > > +	enum lru_list l;
> > > > > > > > +
> > > > > > > > +	/*
> > > > > > > > +	 * For direct reclaim, reclaim the number of pages requested. Less
> > > > > > > > +	 * care is taken to ensure that scanning for each LRU is properly
> > > > > > > > +	 * proportional. This is unfortunate and is improper aging but
> > > > > > > > +	 * minimises the amount of time a process is stalled.
> > > > > > > > +	 */
> > > > > > > > +	if (!current_is_kswapd()) {
> > > > > > > > +		if (nr_reclaimed >= nr_to_reclaim) {
> > > > > > > > +			for_each_evictable_lru(l)
> > > > > > > > +				nr[l] = 0;
> > > > > > > > +		}
> > > > > > > > +		return;
> > > > > > > 
> > > > > > > Heh, this cryptically does what could be done in shrink_lruvec as
> > > > > > > 
> > > > > > > 	if (!current_is_kswapd()) {
> > > > > > > 		if (nr_reclaimed >= nr_to_reclaim)
> > > > > > > 			break;
> > > > > > > 	}
> > > > > > > 
> > > > > > 
> > > > > > Pretty much. At one point during development, this function was more
> > > > > > complex and it evolved into this without me rechecking if splitting it
> > > > > > out still made sense.
> > > > > > 
> > > > > > > Besides that, this is not memcg aware, which I think would break
> > > > > > > targeted reclaim, which is a kind of direct reclaim, but it still
> > > > > > > would be good to stay proportional because it starts with
> > > > > > > DEF_PRIORITY.
> > > > > > 
> > > > > > This does break memcg because it's a special sort of direct reclaim.
> > > > > > 
> > > > > > > I would suggest moving this back to shrink_lruvec and updating the
> > > > > > > test as follows:
> > > > > > 
> > > > > > I also noticed that we check whether the scan counts need to be
> > > > > > normalised more than once.
> > > > > 
> > > > > I didn't mind this because it "disqualified" at least one LRU every
> > > > > round, which sounds reasonable to me because all LRUs would be scanned
> > > > > proportionally.
> > > > 
> > > > Once the scan count for one LRU is 0 then min will always be 0 and no
> > > > further adjustment is made. It's just redundant to check again.
> > > 
> > > Hmm, I was almost sure I wrote that min should be adjusted only if it
> > > is >0 in the first loop, but it is not there...
> > > 
> > > So for real this time:
> > > 
> > > 	for_each_evictable_lru(l)
> > > 		if (nr[l] && nr[l] < min)
> > > 			min = nr[l];
> > > 
> > > This should work, no? Every time you shrink all the LRUs and you have
> > > already reclaimed enough, you take the smallest LRU out of the game.
> > > This should keep the proportions even.
> > 
> > Let's say we started like this
> > 
> > LRU_INACTIVE_ANON	60
> > LRU_ACTIVE_FILE	1000
> > LRU_INACTIVE_FILE	3000
> > 
> > and we've reclaimed nr_to_reclaim pages, then we recalculate the number
> > of pages to scan from each list as
> > 
> > LRU_INACTIVE_ANON	0
> > LRU_ACTIVE_FILE	940
> > LRU_INACTIVE_FILE	2940
> > 
> > We then shrink SWAP_CLUSTER_MAX from each LRU giving us this.
> > 
> > LRU_INACTIVE_ANON	0
> > LRU_ACTIVE_FILE	908
> > LRU_INACTIVE_FILE	2908
> > 
> > Then under your suggestion this would be recalculated as
> > 
> > LRU_INACTIVE_ANON	0
> > LRU_ACTIVE_FILE	0
> > LRU_INACTIVE_FILE	2000
> > 
> > another SWAP_CLUSTER_MAX is reclaimed from each and then we stop
> > reclaiming. I might still be missing the point of your suggestion but I
> > do not think it would preserve the proportion of pages we reclaim from
> > the anon or file LRUs.
> 
> It wouldn't preserve the proportion precisely, because each reclaim round
> is in SWAP_CLUSTER_MAX units, but it would reclaim more from the bigger
> lists than from the smaller ones, which I thought was the whole point. So
> yes, using the word "proportionally" is unfortunate but I didn't find a
> better one.

OK, I have obviously missed that you are not breaking out of the loop if
scan_adjusted. Now that I am looking at the updated patch again, you just
do

	if (nr_reclaimed < nr_to_reclaim || scan_adjusted)
		continue;

So I thought you would just do one round of reclaim after
nr_reclaimed >= nr_to_reclaim, which didn't feel right to me.

Sorry about the confusion!
-- 
Michal Hocko
SUSE Labs