On Fri, Feb 02, 2024 at 01:02:47PM +0800, Efly Young <yangyifei03@xxxxxxxxxxxx> wrote:
> > Looking at the code, I'm not quite sure if this can be read this
> > literally. Efly might be able to elaborate, but we do a full loop of
> > all nodes and cgroups in the tree before checking nr_reclaimed, and
> > rely on priority level for granularity. So request size and complexity
> > of the cgroup tree play a role. I don't know where the exact factor
> > two would come from.
>
> I'm sorry that this conclusion may be arbitrary. It might only suit
> my case. In my case, I traced it looping twice every time before
> checking nr_reclaimed, and it reclaimed less than my request size (1G)
> every time. So I thought the upper bound was 2 * request. But now it
> seems that this is related to the cgroup tree I constructed, my system
> status, and my request size (a relatively large chunk). So there are
> many influencing factors; a specific upper bound is not accurate.

Alright, thanks for the background.

> > IMO it's more accurate to phrase it like this:
> >
> > Reclaim tries to balance nr_to_reclaim fidelity with fairness across
> > nodes and cgroups over which the pages are spread. As such, the bigger
> > the request, the bigger the absolute overreclaim error. Historic
> > in-kernel users of reclaim have used fixed, small request batches to
> > approach an appropriate reclaim rate over time. When we reclaim a user
> > request of arbitrary size, use decaying batches to manage error while
> > maintaining reasonable throughput.

Hm, decay... So shouldn't the formula be

	nr_pages = delta <= SWAP_CLUSTER_MAX ?
			delta : (delta + 3*SWAP_CLUSTER_MAX) / 4

where delta = nr_to_reclaim - nr_reclaimed?

(So that convergence for small deltas is the same as in the original and
other reclaim paths, while a conservative factor is applied for the
effectiveness of larger user requests.)

Thanks,
Michal
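
For illustration, a minimal sketch of how the formula proposed above
could slot into the memory_reclaim() loop in mm/memcontrol.c. This is
the suggested batching, not what is currently in the tree, and the
retry, signal, and lru-drain handling of the real function is elided:

	while (nr_reclaimed < nr_to_reclaim) {
		/* Pages still outstanding from the user's request. */
		unsigned long delta = nr_to_reclaim - nr_reclaimed;
		unsigned long nr_pages;

		/*
		 * Small remainders are requested verbatim, so convergence
		 * matches the other reclaim paths. Larger remainders are
		 * scaled down to bound the absolute overreclaim error while
		 * keeping throughput. The two branches meet without a step:
		 * (delta + 3*SWAP_CLUSTER_MAX) / 4 == SWAP_CLUSTER_MAX
		 * when delta == SWAP_CLUSTER_MAX.
		 */
		if (delta <= SWAP_CLUSTER_MAX)
			nr_pages = delta;
		else
			nr_pages = (delta + 3 * SWAP_CLUSTER_MAX) / 4;

		nr_reclaimed += try_to_free_mem_cgroup_pages(memcg, nr_pages,
							     GFP_KERNEL,
							     reclaim_options);
	}

With the divide-by-four, each iteration's overreclaim error is
proportional to the remaining delta rather than the full request, so
the error decays geometrically as the loop converges.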