Re: [PATCH v1] proc: Implement /proc/self/meminfo

"Enrico Weigelt, metux IT consult" <lkml@xxxxxxxxx> · Fri, 11 Jun 2021 12:37:36 +0200

On 09.06.21 22:31, Johannes Weiner wrote:

Alex has found applications that are trying to do something with
meminfo, and the fields that those applications care about.  I don't see
anyone making the case that specifically what the applications are
trying to do is buggy.

Actually, I do. The assumptions made by those applications are most
certainly wrong, and have been wrong all along.

The problem is that just looking at these numbers is only helpful if
the application is pretty much alone on the machine. You'll find such
applications where the vendor explicitly tells you to run them on a
standalone machine, disable swap, etc. Database systems are probably
the most prominent example.

Under certain circumstances this kind of auto-tuning (e.g.
automatically eating X% of RAM for buffers, etc.) might actually work,
heuristically. But only under certain circumstances.
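
A minimal sketch of the kind of heuristic meant here, assuming the
application parses /proc/meminfo and claims a fixed fraction of
MemAvailable. The function names and the fraction are hypothetical and
not taken from any particular application; the point is only to show
how little the number says once other workloads share the machine.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: value}.

    Fields carrying a "kB" unit are converted to bytes; unitless
    fields (e.g. HugePages_Total) are kept as plain counts.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if not parts:
            continue
        value = int(parts[0])
        if len(parts) > 1 and parts[1] == "kB":
            value *= 1024
        fields[name.strip()] = value
    return fields


def naive_buffer_budget(fields, fraction):
    """Hypothetical heuristic: take a fixed fraction of MemAvailable.

    This ignores cgroup limits, overcommit and every other consumer
    on the machine, which is exactly the weakness discussed above.
    """
    return int(fields["MemAvailable"] * fraction)


if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        info = parse_meminfo(f.read())
    print(naive_buffer_budget(info, 0.25))
```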

Those assumptions had already been defeated by VMs (e.g. overcommit).

I don't feel it's a good idea to add extra kernel code just in order
to make some ill-designed applications work a little less badly
(applications which still need to be changed anyway).

Don't get me wrong, I'm really in favour of having a clean interface
for telling applications how many resources they can take (I've
actually been considering much the same), but this needs to be very
well thought out (and we should also get the orchestration folks into
the loop for that). This basically falls into two categories:

a) hard limits - how much can an application possibly consume
   --> what makes up an application ? process ? process group ?
       cgroup ? namespace ?
b) reasonable defaults - how much can an application take at will, w/o
   affecting others ?
   --> what exactly is "reasonable" ? kernel cache vs. userland buffers?
   --> how to deal w/ overcommit scenarios ?
   --> who shall be in charge of controlling these values ?

It's a very complex problem, not easy to solve. Much of it seems to be
a matter of policy, and depends on the actual workloads.

Maybe, for now, it's better to pursue that at the orchestration level.

Not all the information at the system level translates well to the
container level. Things like available memory require a hierarchical
assessment rather than just a look at the local level, since there
could be limits higher up the tree.
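
To illustrate the hierarchical point: a minimal sketch, assuming a
cgroup v2 layout where each non-root cgroup directory has a memory.max
file containing either "max" or a byte count. It walks from a cgroup up
to the hierarchy root and keeps the smallest limit found. The function
names are hypothetical, and a hard limit is only one input: it says
nothing about memory.high, current usage, or what sibling groups
consume.

```python
import os


def read_limit(path):
    """Return the limit in bytes from a memory.max-style file.

    Returns None if the file is absent (as in the root cgroup) or if
    it contains "max", i.e. no limit is set at this level.
    """
    try:
        with open(path) as f:
            raw = f.read().strip()
    except FileNotFoundError:
        return None
    return None if raw == "max" else int(raw)


def effective_memory_max(cgroup_dir, root):
    """Smallest memory.max on the path from cgroup_dir up to root.

    Both arguments are directory paths; cgroup_dir must be root itself
    or lie below it. Returns None if no level sets a limit.
    """
    root = os.path.abspath(root)
    current = os.path.abspath(cgroup_dir)
    if os.path.commonpath([root, current]) != root:
        raise ValueError("cgroup_dir is not below root")
    limit = None
    while True:
        value = read_limit(os.path.join(current, "memory.max"))
        if value is not None and (limit is None or value < limit):
            limit = value
        if current == root:
            return limit
        current = os.path.dirname(current)
```

On a real system, root would typically be the cgroup2 mount point
(commonly /sys/fs/cgroup) and cgroup_dir the directory named in
/proc/self/cgroup; inside a container with its own cgroup namespace the
levels above the namespace root are not visible at all.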

By the way: what exactly *is* a container anyways ?

The mainline kernel (in contrast to the openvz kernel) doesn't actually
know about containers at all - instead it provides several concepts
like namespaces, cgroups, etc, that together are used for providing
some container environment - but that composition is done in userland,
and there're several approaches w/ different semantics.

Before we can do anything container specific in the kernel, we first
have to come to a general agreement on what a container actually is
from the kernel's perspective. No idea whether we can achieve that at
all (in the near future) w/o actually introducing the concept of a
container within the kernel.

We should also not speculate what users intended to do with the
meminfo data right now. There is a surprising amount of misconception
around what these values actually mean. I'd rather have users show up
on the mailing list directly and outline the broader usecase.

ACK.

The only practical use cases I'm aware of are:

a) safety: know how much memory I can eat until I get -ENOMEM, so
   applications can proactively take countermeasures, e.g.
   preallocation (common practice in safety-related applications)

b) autotuning: how much shall the application take for caches or
   buffers. This is problematic, since it can only work on heuristics,
   which in turn can only be found experimentally within a certain
   range of assumptions (e.g. certain databases like to do that). That
   way you can only find more or less reasonable parameters for the
   majority of cases (assuming you have an idea what that majority
   actually is), but they are still far from optimal. For *good*
   parameters you need to measure your actual workloads and apply good
   knowledge of what the application is actually doing (one of the
   DBA's primary jobs).
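
For use case a), preallocation can be sketched roughly as follows. This
is a minimal illustration, assuming an anonymous mapping is reserved at
startup and every page is written once so that it is backed before the
critical phase begins; the function name is hypothetical. Note that
with overcommit enabled a shortage may show up as an OOM kill while
touching the pages rather than as -ENOMEM, and a real safety-related
application would typically also lock the memory (e.g. mlockall() from
C), which is not shown here.

```python
import mmap


def preallocate(size):
    """Reserve an anonymous mapping and touch every page up front.

    Writing one byte per page forces the kernel to back the page now,
    instead of lazily on first use later.
    """
    region = mmap.mmap(-1, size)
    for offset in range(0, size, mmap.PAGESIZE):
        region[offset] = 0  # the write fault allocates the page
    return region
```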

--mtx
--
---
Note: unencrypted e-mails can easily be intercepted and manipulated!
For confidential communication, please send your GPG/PGP key.
---
Enrico Weigelt, metux IT consult
Free software and Linux embedded engineering
info@xxxxxxxxx -- +49-151-27565287