Dear David,
Thank you for your reply.
On 09.06.21 at 13:17, David Hildenbrand wrote:
On 04.06.21 13:36, Paul Menzel wrote:
On a 1 TB RAM compute server with Linux 5.10.24 and memory
overcommitting disabled, we ran into a situation where processes like
SSH couldn’t allocate memory anymore.
$ more /proc/cmdline
BOOT_IMAGE=/boot/bzImage-5.10.24.mx64.375 root=LABEL=root ro crashkernel=256M console=ttyS0,115200n8 console=tty0 init=/bin/systemd audit=0 random.trust_cpu=on
2021-06-03T22:00:28+02:00 godsavethequeen sshd[89163]: pam_systemd(sshd:session): Failed to create session: Unit session-25654.scope not found.
2021-06-03T22:00:29+02:00 godsavethequeen sshd[89163]: error: do_exec_no_pty: fork: Cannot allocate memory
2021-06-03T22:00:29+02:00 godsavethequeen sshd[89163]: pam_unix(sshd:session): session closed for user root
2021-06-03T22:01:41+02:00 godsavethequeen sshd[1834]: error: fork: Cannot allocate memory
2021-06-03T22:01:41+02:00 godsavethequeen sshd[1834]: error: ssh_msg_send: write: Broken pipe
2021-06-03T22:01:41+02:00 godsavethequeen sshd[1834]: error: send_rexec_state: ssh_msg_send failed
$ free -h
              total        used        free      shared  buff/cache   available
Mem:           1.0T        606G        2.6G        2.2M        395G        391G
Swap:            0B          0B          0B
Looking at this, I would have expected that the pages(?) in buff/cache
would be moved or dropped to make memory available.
Looking at `/proc/meminfo` (attached):
MemTotal: 1052411824 kB
MemFree: 2709976 kB
MemAvailable: 410847908 kB
[…]
CommitLimit: 1052411824 kB
Committed_AS: 1052455260 kB
[…]
With memory overcommit disabled, each accountable mapping
(mm/mmap.c:accountable_mapping()) will count towards Committed_AS. So
you might still have plenty of free memory in the system reserved for
these mappings, yet Linux won't allow for more accountable mappings.
That's why you see the "Cannot allocate memory" messages. mmap() failed.
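To make the accounting above concrete, here is a minimal sketch that
parses /proc/meminfo-style text and reports how much commit headroom is
left. The helper names parse_meminfo and commit_headroom_kb are made up
for illustration; the sample values are the ones quoted in this thread.

```python
def parse_meminfo(text):
    """Return a dict mapping /proc/meminfo field names to integer values."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not of the form "Name: value"
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def commit_headroom_kb(fields):
    """kB that may still be committed; negative means the limit is exceeded."""
    return fields["CommitLimit"] - fields["Committed_AS"]


# Values quoted in this thread (all in kB).
sample = """\
MemTotal:       1052411824 kB
MemAvailable:    410847908 kB
CommitLimit:    1052411824 kB
Committed_AS:   1052455260 kB
"""

info = parse_meminfo(sample)
print(commit_headroom_kb(info))  # -43436
```

On a live system the same functions can be fed open("/proc/meminfo").read().
A headroom at or below zero means new accountable mappings are refused,
no matter how large MemAvailable is.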
Buffers: 3212 kB
Cached: 411083788 kB
SwapCached: 0 kB
Active: 303175824 kB
Inactive: 740080100 kB
Active(anon): 1448 kB
Inactive(anon): 632169724 kB
Active(file): 303174376 kB
Inactive(file): 107910376 kB
The documentation [1] describes *MemAvailable*, *Buffers*, and *Cached*:
MemAvailable
An estimate of how much memory is available for starting new
applications, without swapping. Calculated from MemFree,
SReclaimable, the size of the file LRU lists, and the low
watermarks in each zone.
The estimate takes into account that the system needs some
page cache to function well, and that not all reclaimable
slab will be reclaimable, due to items being in use. The
impact of those factors will vary from system to system.
Buffers
Relatively temporary storage for raw disk blocks
shouldn't get tremendously large (20MB or so)
Cached
in-memory cache for files read from the disk (the
pagecache). Doesn't include SwapCached
So I would have assumed that the kernel evicts files from the page
cache to free up memory.
Committed_AS is greater than the commit limit (total memory).
Is such behavior expected?
We're talking about 43436 kB that exceed the CommitLimit.
The CommitLimit might change (grow/shrink) when
a) The number of hugetlb pages changes
b) Swap space is resized
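For reference, a small sketch of how CommitLimit is derived with
vm.overcommit_memory=2, following the formula in the kernel's meminfo
documentation: (total RAM - hugetlb pages) * overcommit_ratio / 100 +
swap. The function name is hypothetical, and the sketch assumes
vm.overcommit_kbytes is 0 and works in kB, whereas the kernel works in
pages, so rounding may differ slightly.

```python
def commit_limit_kb(mem_total_kb, hugetlb_kb, swap_total_kb, overcommit_ratio):
    """Approximate CommitLimit in kB for vm.overcommit_memory=2.

    Assumes vm.overcommit_kbytes is 0, so vm.overcommit_ratio applies.
    """
    return (mem_total_kb - hugetlb_kb) * overcommit_ratio // 100 + swap_total_kb


# MemTotal and swap size are from this thread. No hugetlb pages and a
# ratio of 100 are assumptions; they would be consistent with
# CommitLimit being equal to MemTotal.
print(commit_limit_kb(1052411824, 0, 0, 100))  # 1052411824
```

This also shows why a) and b) move the limit: both hugetlb_kb and
swap_total_kb enter the formula directly.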
If CommitLimit did not change, Committed_AS should actually not exceed
it. IIUC, it can only happen temporarily while trying to create a new
mapping. We increase Committed_AS unconditionally and decrease it again
if we reject it.
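The charge-then-check pattern described above can be modelled like this.
It is a toy model, not kernel code: the class and method names are
invented, it is single-threaded, and it ignores the admin and user
reserves as well as the exact comparison the kernel uses.

```python
class CommitAccount:
    """Toy model of strict overcommit accounting (illustration only)."""

    def __init__(self, limit_kb, committed_kb=0):
        self.limit_kb = limit_kb
        self.committed_kb = committed_kb

    def try_charge(self, size_kb):
        """Charge size_kb; return True if accepted, False if rejected."""
        # Charge unconditionally first, so the counter can briefly
        # exceed the limit while the request is being evaluated.
        self.committed_kb += size_kb
        if self.committed_kb <= self.limit_kb:
            return True
        # Over the limit: undo the charge and reject the mapping.
        self.committed_kb -= size_kb
        return False


acct = CommitAccount(limit_kb=100, committed_kb=90)
print(acct.try_charge(20), acct.committed_kb)  # False 90
print(acct.try_charge(10), acct.committed_kb)  # True 100
```

Between the increment and the rollback, another reader of the counter
could observe a value above the limit, which is the temporary excess
mentioned above.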
I can’t say for sure, as the system was rebooted, but I thought the
value stayed the same.
Kind regards,
Paul
[1]: https://www.kernel.org/doc/html/latest/filesystems/proc.html#meminfo