On Thu, Feb 1, 2024 at 7:34 AM Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
>
> On Thu, Feb 01, 2024 at 02:57:22PM +0100, Michal Koutný wrote:
> > Hello.
> >
> > On Wed, Jan 31, 2024 at 04:24:41PM +0000, "T.J. Mercier" <tjmercier@xxxxxxxxxx> wrote:
> > >  		reclaimed = try_to_free_mem_cgroup_pages(memcg,
> > > -					min(nr_to_reclaim - nr_reclaimed, SWAP_CLUSTER_MAX),
> > > +					max((nr_to_reclaim - nr_reclaimed) / 4,
> > > +					    (nr_to_reclaim - nr_reclaimed) % 4),
> >
> > The 1/4 factor looks like magic.
>
> It's just cutting the work into quarters to balance throughput with
> goal accuracy. It's no more or less magic than DEF_PRIORITY being 12,
> or SWAP_CLUSTER_MAX being 32.

Using SWAP_CLUSTER_MAX is sort of like having a really large divisor
instead of 4 (or 1 like before).

I recorded the average number of iterations required to complete the
1G reclaim for the measurements I took, and it looks like this:

pre-0388536ac291:          1
post-0388536ac291:      1814
(reclaim-reclaimed)/4:    17

Given the results with /4, I don't think the perf we get here is
particularly sensitive to the number we choose, but it's definitely a
tradeoff.

<snip>

> > Also IMO importantly, when nr_to_reclaim - nr_reclaimed is less than 8,
> > the formula gives arbitrary (unrelated to delta's magnitude) values.
>
> try_to_free_mem_cgroup_pages() rounds up to SWAP_CLUSTER_MAX. So the
> error margin is much higher at the smaller end of requests anyway.
> But practically speaking, users care much less if you reclaim 32 pages
> when 16 were requested than if you reclaim 2G when 1G was requested.

I like Johannes's suggestion of just a comment instead of the mod op.
It's easier to read, slightly less generated code, and even if we
didn't have the .nr_to_reclaim = max(nr_pages, SWAP_CLUSTER_MAX) in
try_to_free_mem_cgroup_pages, memory_reclaim would still get very
close to the target before running out of nr_retries.
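
For anyone who wants to play with the divisor, below is a quick
user-space toy model (not kernel code, and all the names in it are my
own). It assumes every request fully succeeds after the
SWAP_CLUSTER_MAX round-up, so the absolute counts won't match the
measured averages above, where real reclaim can return more or less
than asked, but it shows how the iteration count scales with each
batching policy:

#include <stdio.h>

#define SWAP_CLUSTER_MAX 32UL

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

static unsigned long max_ul(unsigned long a, unsigned long b)
{
	return a > b ? a : b;
}

static unsigned long batch(int policy, unsigned long remaining)
{
	switch (policy) {
	case 0:		/* pre-0388536ac291: ask for the full remainder */
		return remaining;
	case 1:		/* post-0388536ac291: cap at SWAP_CLUSTER_MAX */
		return min_ul(remaining, SWAP_CLUSTER_MAX);
	default:	/* proposed: quarter of the remainder */
		return remaining / 4;
	}
}

int main(void)
{
	const unsigned long nr_to_reclaim = 1UL << 18;	/* 1G of 4K pages */
	static const char *const names[] = {
		"full remainder", "SWAP_CLUSTER_MAX cap", "remainder / 4",
	};

	for (int policy = 0; policy < 3; policy++) {
		unsigned long nr_reclaimed = 0, iters = 0;

		while (nr_reclaimed < nr_to_reclaim) {
			unsigned long req = batch(policy,
						  nr_to_reclaim - nr_reclaimed);

			/*
			 * Model the .nr_to_reclaim = max(nr_pages,
			 * SWAP_CLUSTER_MAX) round-up inside
			 * try_to_free_mem_cgroup_pages(), and assume the
			 * request always fully succeeds.
			 */
			nr_reclaimed += max_ul(req, SWAP_CLUSTER_MAX);
			iters++;
		}
		printf("%-20s %6lu iterations\n", names[policy], iters);
	}
	return 0;
}

Note how with /4 the tail of the request (including the last < 4
pages of the delta) gets mopped up by the SWAP_CLUSTER_MAX round-up
anyway, which is why a comment can stand in for the % 4 term.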