On 09.06.21 22:31, Johannes Weiner wrote:
Alex has found applications that are trying to do something with meminfo, and the fields that those applications care about. I don't see anyone making the case that specifically what the applications are trying to do is buggy.
Actually, I do. The assumptions made by those applications are most certainly wrong, and have been wrong all along. Just looking at these numbers is only helpful if the application is pretty much alone on the machine. You'll find that when the vendor explicitly tells you to run it on a standalone machine, disable swap, etc. Database systems are probably the most prominent sector here. Under certain circumstances this kind of auto-tuning (eg. automatically eat X% of RAM for buffers) might actually work, heuristically. But those assumptions had already been defeated by VMs (eg. overcommit).

I don't feel it's a good idea to add extra kernel code just to make some ill-designed applications work a little less badly (they still need to be changed anyway).

Don't get me wrong, I'm really in favour of having a clean interface for telling applications how many resources they can take (I've actually considered much the same), but this needs to be very well thought out (and we should also get the orchestration folks into the loop for that).

This basically falls into two categories:

a) hard limits - how much can an application possibly consume
   --> what makes up an application ? process ? process group ? cgroup ? namespace ?

b) reasonable defaults - how much can an application take at will, w/o affecting others ?
   --> what exactly is "reasonable" ? kernel cache vs. userland buffers ?
   --> how to deal w/ overcommit scenarios ?
   --> who shall be in charge of controlling these values ?

It's a very complex problem, not easy to solve. Much of it seems to be a matter of policy, and depends on the actual workloads. Maybe, for now, it's better to pursue that on the orchestration level.
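For illustration, the kind of auto-tuning heuristic criticised above might look roughly like the following sketch. The function names and the 25% default are hypothetical, not taken from any particular application; only the /proc/meminfo line format ("Name:   value kB") is assumed.

```python
def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into {field: value}, kB fields in bytes."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if not parts:
            continue
        value = int(parts[0])
        # Most fields carry a "kB" unit; a few (eg. HugePages_Total) are plain counts.
        if len(parts) > 1 and parts[1] == "kB":
            value *= 1024
        fields[name.strip()] = value
    return fields


def naive_buffer_size(meminfo_text: str, fraction: float = 0.25) -> int:
    """Hypothetical heuristic: claim a fixed fraction of MemAvailable for buffers."""
    return int(parse_meminfo(meminfo_text)["MemAvailable"] * fraction)
```

The result only reflects the system-wide view at one moment. It knows nothing about other tenants, cgroup limits or overcommit on the host, which is exactly why it only holds up on a dedicated machine.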
Not all the information at the system level translates well to the container level. Things like available memory require a hierarchical assessment rather than just a look at the local level, since there could be limits higher up the tree.
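A minimal sketch of what such a hierarchical assessment could look like from userland, assuming a cgroup v2 layout where each level may carry `memory.max` and `memory.current` files. The helper names are hypothetical, and this is not how the kernel itself computes anything; it only illustrates that the tightest limit may sit further up the tree.

```python
from pathlib import Path
from typing import Optional


def read_bytes_value(path: Path) -> Optional[int]:
    """Return the integer in a cgroup v2 file, or None if absent or 'max'."""
    try:
        text = path.read_text().strip()
    except FileNotFoundError:
        return None
    return None if text == "max" else int(text)


def hierarchical_headroom(leaf: Path, root: Path) -> Optional[int]:
    """Smallest (memory.max - memory.current) on the path from leaf up to root.

    Returns None when no level on that path sets a limit.
    """
    leaf, root = leaf.resolve(), root.resolve()
    tightest = None
    node = leaf
    while True:
        limit = read_bytes_value(node / "memory.max")
        usage = read_bytes_value(node / "memory.current")
        if limit is not None and usage is not None:
            room = max(limit - usage, 0)
            tightest = room if tightest is None else min(tightest, room)
        # Stop at the hierarchy root (or the filesystem root as a safety net).
        if node == root or node.parent == node:
            break
        node = node.parent
    return tightest
```

On a real system `root` would typically be the cgroup2 mount point and `leaf` the group named in /proc/self/cgroup. Inside a cgroup namespace the ancestors above the namespace root are usually not visible, so a process looking only at its own level cannot see limits imposed higher up.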
By the way: what exactly *is* a container anyway ? The mainline kernel (in contrast to the openvz kernel) doesn't actually know about containers at all. Instead it provides several concepts like namespaces, cgroups, etc, that together are used for providing some container environment - but that composition is done in userland, and there are several approaches w/ different semantics.

Before we can do anything container-specific in the kernel, we first have to come to a general agreement on what a container actually is from the kernel's perspective. No idea whether we can achieve that at all (in the near future) w/o actually introducing the concept of a container within the kernel.
We should also not speculate what users intended to do with the meminfo data right now. There is a surprising amount of misconception around what these values actually mean. I'd rather have users show up on the mailing list directly and outline the broader usecase.
ACK. The only practical use cases I'm aware of are:

a) safety: know how much memory I can eat until I get -ENOMEM, so applications can proactively take countermeasures, eg. pre-allocation (common practice in safety-related applications)

b) autotuning: how much shall the application take for caches or buffers. This is problematic, since it can only work on heuristics, which in turn can only be found experimentally within a certain range of assumptions (eg. certain databases like to do that). That way you can only find more or less reasonable parameters for the majority of cases (assuming you have an idea what that majority actually is), but still far from optimal. For *good* parameters you need to measure your actual workloads and apply good knowledge of what the application is actually doing (one of the DBA's primary jobs).

--mtx

-- 
---
Note: unencrypted e-mails can easily be intercepted and manipulated ! For confidential communication, please send your GPG/PGP key.
---
Enrico Weigelt, metux IT consult
Free software and Linux embedded engineering
info@xxxxxxxxx -- +49-151-27565287