On 05.04.24 20:43, Yosry Ahmed wrote:
On Fri, Apr 5, 2024 at 8:26 AM Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
On Fri, Apr 05, 2024 at 05:35:07AM +0000, Yosry Ahmed wrote:
Currently, we calculate the zswap global limit, and potentially the
acceptance threshold in the zswap, in pages in the zswap store path.
This is unnecessary because the values rarely change.
Instead, precalculate the them when the module parameters are updated,
which should be rare. Since we are adding custom handlers for setting
the percentages now, add proper validation that they are <= 100.
Suggested-by: Johannes Weiner <hannes@xxxxxxxxxxx>
Signed-off-by: Yosry Ahmed <yosryahmed@xxxxxxxxxx>
Nice! Getting that stuff out of the hotpath!
Two comments below:
@@ -684,6 +703,43 @@ static int zswap_enabled_param_set(const char *val,
return ret;
}
+static int __zswap_percent_param_set(const char *val,
+ const struct kernel_param *kp)
+{
+ unsigned int n;
+ int ret;
+
+ ret = kstrtouint(val, 10, &n);
+ if (ret || n > 100)
+ return -EINVAL;
+
+ return param_set_uint(val, kp);
+}
+
+static int zswap_max_pool_param_set(const char *val,
+ const struct kernel_param *kp)
+{
+ int err = __zswap_percent_param_set(val, kp);
+
+ if (!err) {
+ zswap_update_max_pages();
+ zswap_update_accept_thr_pages();
+ }
+
+ return err;
+}
+
+static int zswap_accept_thr_param_set(const char *val,
+ const struct kernel_param *kp)
+{
+ int err = __zswap_percent_param_set(val, kp);
+
+ if (!err)
+ zswap_update_accept_thr_pages();
+
+ return err;
+}
I think you can keep this simple and just always update both if
anything changes for whatever reason. It's an extremely rare event
after all. That should cut it from 3 functions to 1.
Yeah we can also combine both update functions in this case.
Note that totalram_pages can also change during memory onlining and
offlining. For that you need a memory notifier that also calls that
refresh function. It's simple enough, though, check out the code
around register_memory_notifier() in drivers/xen/balloon.c.
Good point, I completely missed this. It seems like totalram_pages can
actually change from contexts other than memory hotplug as well,
specifically through adjust_managed_page_count(), and mostly through
ballooning drivers. Do we trigger the notifiers in this case? I can't
find such logic.
Things like virtio-balloon never online/offline memory and would never
call it.
Things like virtio-mem sometimes will online/offline memory and would
sometimes call it (but not always). Things like the Hyper-V balloon and
XEN balloon never offline memory, and would only call it when onlining
memory.
It seems like in this case the actual amount of memory does not
change, but the drivers take it away from the system. It makes some
sense to me that the zswap limits do not change in this case,
especially that userspace just sets those limits as a percentage of
total memory. I wouldn't expect userspace to take ballooning into
account here.
For virtio-mem, it does change ("actual amount of memory"). For
virtio-balloon, it's tricky. When using virtio-balloon for VM resizing,
it would similarly change. When using it for pure memory overcommit, it
depends on whatever the policy in the hypervisor is ... might be that
under memory pressure that memory is simply given back to the VM.
However, it would be a behavioral change from today where we always
rely on totalram_pages(). Also, if userspace happens to change the
limit when a driver is holding a big chunk of memory away from
totalram_pages, then the limit would be constantly underestimated.
I do not have enough familiarity with memory ballooning to know for
sure if this is okay. How much memory can memory ballooning add/remove
from totalram_pages(), and is it usually transient or does it stick
around for a while?
Also CCing David here.
It can be a lot. Especially with the Hyper-V balloon (but also on
environments where other forms of memory hotunplug are not supported),
memory ballooning can be used to unplug memory. So that memory can be
gone for good and it can end up being quite a lot of memory.
The clean thing to do would be to have a way for other subsystems to get
notified on any totalram_pages() changes, so they can adjust accordingly.
--
Cheers,
David / dhildenb