On Mon, 27 Aug 2018 09:26:21 -0700 Roman Gushchin <guro@xxxxxx> wrote:

> I've noticed that dying memory cgroups are often pinned in memory by a
> single pagecache page.  Even under moderate memory pressure they
> sometimes stayed in that state for a long time.  That looked strange.
>
> My investigation showed that the problem is caused by applying the LRU
> pressure balancing math:
>
>   scan = div64_u64(scan * fraction[lru], denominator),
>
> where
>
>   denominator = fraction[anon] + fraction[file] + 1.
>
> Because fraction[lru] is always less than denominator, if the initial
> scan size is 1, the result is always 0.
>
> This means the last page is not scanned and has no chance of being
> reclaimed.
>
> Fix this by rounding up the result of the division.
>
> In practice this change significantly improves the reclaim speed of
> dying cgroups.
>
> ...
>
> --- a/include/linux/math64.h
> +++ b/include/linux/math64.h
> @@ -281,4 +281,6 @@ static inline u64 mul_u64_u32_div(u64 a, u32 mul, u32 divisor)
>  }
>  #endif /* mul_u64_u32_div */
>
> +#define DIV64_U64_ROUND_UP(ll, d) div64_u64((ll) + (d) - 1, (d))

This macro references arg `d' more than once.  That can cause problems
if the passed expression has side effects, and it is poor practice.
Can we please redo this with a temporary?

>  #endif /* _LINUX_MATH64_H */
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index d649b242b989..2c67a0121c6d 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2446,9 +2446,11 @@ static void get_scan_count(struct lruvec *lruvec, struct mem_cgroup *memcg,
>  			/*
>  			 * Scan types proportional to swappiness and
>  			 * their relative recent reclaim efficiency.
> +			 * Make sure we don't miss the last page
> +			 * because of a round-off error.
>  			 */
> -			scan = div64_u64(scan * fraction[file],
> -					 denominator);
> +			scan = DIV64_U64_ROUND_UP(scan * fraction[file],
> +						  denominator);
>  			break;
>  		case SCAN_FILE:
>  		case SCAN_ANON:
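
For anyone who wants to see the round-off failure concretely, here is a
minimal userspace sketch.  The fraction/denominator values are invented
for the example (the only property that matters is fraction[lru] <
denominator, which always holds given the +1 in the denominator), and
div64_u64() is stubbed as plain truncating division:

#include <stdio.h>
#include <inttypes.h>

/* Userspace stand-in for the kernel's div64_u64(): truncating
 * 64-bit division. */
static uint64_t div64_u64(uint64_t dividend, uint64_t divisor)
{
	return dividend / divisor;
}

int main(void)
{
	uint64_t fraction_lru = 50;	/* invented example value */
	uint64_t denominator = 101;	/* invented example value */
	uint64_t scan = 1;		/* a single page left on the LRU */

	/* Truncating: 1 * 50 / 101 == 0, so the last page is skipped. */
	printf("truncated:  %" PRIu64 "\n",
	       div64_u64(scan * fraction_lru, denominator));

	/* Rounding up: (50 + 101 - 1) / 101 == 1, so it gets scanned. */
	printf("rounded up: %" PRIu64 "\n",
	       div64_u64(scan * fraction_lru + denominator - 1,
			 denominator));
	return 0;
}

It prints 0 and then 1, which is exactly the one-page-never-scanned
behaviour described above.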
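
And regarding the request above for a temporary: one untested
possibility is a GCC/Clang statement expression, a construct kernel
headers already use widely, so that `d' is evaluated exactly once:

/*
 * Sketch only: evaluate "d" exactly once via a statement
 * expression, so that e.g. DIV64_U64_ROUND_UP(x, f()) does not
 * invoke f() twice.
 */
#define DIV64_U64_ROUND_UP(ll, d)					\
	({ u64 _tmp = (d); div64_u64((ll) + _tmp - 1, _tmp); })

With the posted definition, a divisor expression with side effects
would be evaluated twice; the temporary confines it to a single
evaluation while keeping the macro usable in expression context.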