2024年6月14日(金) 1:49 Shakeel Butt <shakeel.butt@xxxxxxxxx>: > > On Thu, Jun 13, 2024 at 08:04:39AM GMT, Nhat Pham wrote: > [...] > > > > > > > > > > Is the idea here to avoid moving the iterator to another offline memcg > > > > > that zswap_memcg_offline_cleanup() was already called for, to avoid > > > > > holding a ref on that memcg until the next run of zswap shrinking? > > > > > > > > > > If yes, I think it's probably worth doing. But why do we need to > > > > > release and reacquire the lock in the loop above? > > > > > > > > Yes, the existing cleaner might leave the offline, already-cleaned memcg. > > > > > > > > The reacquiring lock is to not loop inside the critical section. > > > > In shrink_worker of v0 patch, the loop was restarted on offline memcg > > > > without releasing the lock. Nhat pointed out that we should drop the > > > > lock after every mem_cgroup_iter() call. v1 was changed to reacquire > > > > once per iteration like the cleaner code above. > > > > > > I am not sure how often we'll run into a situation where we'll be > > > holding the lock for too long tbh. It should be unlikely to keep > > > encountering offline memcgs for a long time. > > > > > > Nhat, do you think this could cause a problem in practice? > > > > I don't remember prescribing anything to be honest :) I think I was > > just asking why can't we just drop the lock, then "continue;". This is > > mostly for simplicity's sake. > > > > https://lore.kernel.org/linux-mm/CAKEwX=MwrRc43iM2050v5u-TPUK4Yn+a4G7+h6ieKhpQ7WtQ=A@xxxxxxxxxxxxxx/ I apologize for misinterpreting your comments. Removing release/reacquire. > > > > But I think as Takero pointed out, it would still skip over the memcg > > that was (concurrently) updated to zswap_next_shrink by the memcg > > offline callback. > > What's the issue with keep traversing until an online memcg is found? > Something like the following: > > > spin_lock(&zswap_shrink_lock); > do { > zswap_next_shrink = mem_cgroup_iter(NULL, zswap_next_shrink, NULL); > } while (zswap_next_shrink && !mem_cgroup_online(zswap_next_shrink)); > > if (!zswap_next_shrink) > zswap_next_shrink = mem_cgroup_iter(NULL, NULL, NULL); > .... > > Is the concern that there can a lot of offlined memcgs which may cause > need resched warnings? To avoid using the goto-based loop, here's the new version, including Shakeel's suggestion: ```c do { spin_lock(&zswap_shrink_lock); /* * Advance the cursor to start shrinking from the next memcg * after zswap_next_shrink. One memcg might be skipped from * shrinking if the cleaner also advanced the cursor, but it * will happen at most once per offlining memcg. */ do { zswap_next_shrink = mem_cgroup_iter(NULL, zswap_next_shrink, NULL); memcg = zswap_next_shrink; } while (memcg && !mem_cgroup_tryget_online(memcg)); if (!memcg) { spin_unlock(&zswap_shrink_lock); ``` We can add or remove `spin_unlock();spin_lock();` just after mem_cgroup_iter(), if needed. I believe the behavior is identical to v1 except for the starting point of iteration. For Naht's comment, 2. No skipping over zswap_next_shrink updated by the memcg offline cleaner. While this was true for v1, I'm moved to accept this skipping as it's negligibly rare. As Yorsy commented, v1 retried the last memcg from the last shrink_worker() run. There are several options for shrink_worker where to start with: 1. Starting from the next memcg after zswap_next_shrink: It might skip one memcg, but this is quite rare. It is the current behavior before patch. 2. Starting from zswap_next_shrink: It might shrink one more page from the memcg in addition to the one by the last shrink_worker() run. This should also be rare, but probably more frequent than option 1. This is the v0 patch behavior. 3. Addressing both: Save the last memcg as well. The worker checks if it has been modified by the cleaner and advances only if it hasn't. Like this: ```c do { if (zswap_last_shrink == zswap_next_shrink) { zswap_next_shrink = mem_cgroup_iter(NULL, zswap_next_shrink, NULL); } memcg = zswap_next_shrink; } while (memcg && !mem_cgroup_tryget_online(memcg)); zswap_last_shrink = memcg; ``` Which one would be better? or any other idea?