On Fri, Dec 01, 2023 at 10:33:01AM +0100, Michal Hocko wrote:
> On Thu 30-11-23 11:56:42, Johannes Weiner wrote:
> [...]
> > So I wouldn't say it's merely a reclaim hint. It controls a very
> > concrete and influential factor in VM decision making. And since the
> > global swappiness is long-established ABI, I don't expect its meaning
> > to change significantly any time soon.
>
> As I've said I am more worried about potential future changes which
> would modify existing, reduce or add more corner cases which would be
> seen as a change of behavior from the user space POV. That means that we
> would have to be really explicit about the fact that the reclaim is free
> to override the swappiness provided by user. So essentially a best
> effort interface without any actual guarantees. That surely makes it
> harder to use. Is it still useable?

But it's not free to override the setting as it pleases. I wrote a
detailed list of the current exceptions, and why the user wouldn't
have strong expectations of swappiness being respected in those
cases. Having reasonable limitations is not the same as everything
being up for grabs.

Again, the swappiness setting is ABI, and people would definitely
complain if we ignored their request in an unexpected situation and
regressed their workloads.

I'm not against documenting the exceptions and limitations. Not just
for proactive reclaim, but for swappiness in general. But I don't
think it's fair to say that there are NO rules and NO userspace
contract around this parameter (and I'm the one who wrote most of the
balancing code that implements the swappiness control).

So considering what swappiness DOES provide, and the definition and
behavior to which we're tied by ABI rules, yes I do think it's useful
to control this from the proactive reclaim context. In fact, we know
it's useful, because we've been doing it for a while in production
now - just in a hacky way, and this patch is merely making it less
hacky.

> Btw. IIRC these concerns were part of the reason why memcg v2 doesn't
> have swappiness interface. If we decide to export swappiness via
> memory.reclaim interface does it mean we will do so on per-memcg level
> as well?

Well I'm the person who wrote the initial cgroup2 memory interface,
and I left it out because there was no clear usecase for why you'd
want to tweak it on a per-container basis.

But Dan did bring up a new and very concrete usecase: controlling for
write endurance. And it's not just a theoretical one, but a proven
real world application.

As far as adding a static memory.swappiness goes, I wouldn't add it
just because, but wait for a concrete usecase for that specifically. I
don't think Dan's rationale extends to it. But if a usecase comes up
and is convincing, I wouldn't be opposed to it.