On 04.06.21 13:36, Paul Menzel wrote:
Dear Linux folks,
On a 1 TB RAM compute server with Linux 5.10.24 and memory
overcommitting disabled, we ran into a situation where processes like
SSH couldn’t allocate memory anymore.
$ more /proc/cmdline
BOOT_IMAGE=/boot/bzImage-5.10.24.mx64.375 root=LABEL=root ro crashkernel=256M console=ttyS0,115200n8 console=tty0 init=/bin/systemd audit=0 random.trust_cpu=on
2021-06-03T22:00:28+02:00 godsavethequeen sshd[89163]: pam_systemd(sshd:session): Failed to create session: Unit session-25654.scope not found.
2021-06-03T22:00:29+02:00 godsavethequeen sshd[89163]: error: do_exec_no_pty: fork: Cannot allocate memory
2021-06-03T22:00:29+02:00 godsavethequeen sshd[89163]: pam_unix(sshd:session): session closed for user root
2021-06-03T22:01:41+02:00 godsavethequeen sshd[1834]: error: fork: Cannot allocate memory
2021-06-03T22:01:41+02:00 godsavethequeen sshd[1834]: error: ssh_msg_send: write: Broken pipe
2021-06-03T22:01:41+02:00 godsavethequeen sshd[1834]: error: send_rexec_state: ssh_msg_send failed
$ free -h
              total        used        free      shared  buff/cache   available
Mem:           1.0T        606G        2.6G        2.2M        395G        391G
Swap:            0B          0B          0B
Looking at this, I would have expected that the pages(?) in buff/cache
would be moved/deleted to make memory available.
Looking at `/proc/meminfo` (attached):
MemTotal: 1052411824 kB
MemFree: 2709976 kB
MemAvailable: 410847908 kB
[…]
CommitLimit: 1052411824 kB
Committed_AS: 1052455260 kB
[…]
With memory overcommit disabled, each accountable mapping
(mm/mmap.c:accountable_mapping()) will count towards Committed_AS. So
you might still have plenty of free memory in the system that is
reserved for these mappings, yet Linux won't allow any more accountable
mappings. That's why you see the "Cannot allocate memory" messages.
mmap() failed.
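To make the accounting concrete, here is a minimal Python sketch that
compares the two relevant fields from the meminfo excerpt quoted above.
The helper names parse_meminfo() and commit_headroom_kb() are made up
for illustration, and SAMPLE contains only the lines quoted in this
mail, not the full attachment.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style 'Name: value kB' lines into a dict of ints."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a 'Name:' prefix, e.g. elisions
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def commit_headroom_kb(fields):
    """kB of accountable mappings still allowed; <= 0 means none are left."""
    return fields["CommitLimit"] - fields["Committed_AS"]


# Values copied from the report; everything else in meminfo is omitted.
SAMPLE = """\
MemTotal:       1052411824 kB
MemFree:           2709976 kB
MemAvailable:    410847908 kB
CommitLimit:    1052411824 kB
Committed_AS:   1052455260 kB
"""

if __name__ == "__main__":
    fields = parse_meminfo(SAMPLE)
    print("MemAvailable kB:", fields["MemAvailable"])
    print("commit headroom kB:", commit_headroom_kb(fields))
```

On a live system the same functions could be fed the contents of
/proc/meminfo. The point is that MemAvailable can be large while the
commit headroom is zero or negative.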
Committed_AS is greater than the commit limit (total memory).
Is such behavior expected?
We're talking about 43436 kB that exceed the CommitLimit.
The CommitLimit might change (grow/shrink) when
a) The number of hugetlb pages changes
b) Swap space is resized
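As far as I know, the kernel documentation describes the limit in
strict mode (vm.overcommit_memory=2) as
(total RAM - hugetlb pages) * overcommit_ratio / 100 + swap, or
vm.overcommit_kbytes + swap if that sysctl is set, which is why a) and
b) move it. A rough Python sketch of that formula, working in kB
instead of pages and using a made-up function name:

```python
def commit_limit_kb(total_ram_kb, hugetlb_kb, swap_kb,
                    overcommit_ratio, overcommit_kbytes=0):
    """Approximate CommitLimit in kB for vm.overcommit_memory=2.

    Simplification: the kernel computes this in pages, not kB.
    """
    if overcommit_kbytes:
        # An absolute limit takes precedence over the ratio.
        allowed = overcommit_kbytes
    else:
        allowed = (total_ram_kb - hugetlb_kb) * overcommit_ratio // 100
    return allowed + swap_kb


if __name__ == "__main__":
    mem_total_kb = 1052411824  # MemTotal from the report

    # Assumed settings: ratio 100, no hugetlb pages, no swap.
    print(commit_limit_kb(mem_total_kb, 0, 0, overcommit_ratio=100))
    # a) reserving 1 GiB of hugetlb pages lowers the limit
    print(commit_limit_kb(mem_total_kb, 1048576, 0, overcommit_ratio=100))
    # b) adding 8 GiB of swap raises it
    print(commit_limit_kb(mem_total_kb, 0, 8388608, overcommit_ratio=100))
```

With the MemTotal above, no swap, and assuming no hugetlb pages, a
ratio of 100 reproduces the reported CommitLimit. The actual sysctl
settings on that machine are not shown in the report, so treat these
inputs as guesses.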
If CommitLimit did not change, Committed_AS should actually not exceed
it. IIUC, it can only happen temporarily while trying to create a new
mapping: we increase Committed_AS unconditionally and decrease it again
if we reject the mapping.
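The charge-first, roll-back-on-reject pattern can be illustrated with a
small single-threaded toy model. This is not the kernel code: the class
name, the peak_kb field, and the exact comparison are assumptions made
for the sketch, and the real counters are more involved.

```python
class CommitAccounting:
    """Toy model of strict overcommit accounting (all sizes in kB)."""

    def __init__(self, limit_kb, committed_kb=0):
        self.limit_kb = limit_kb
        self.committed_kb = committed_kb
        self.peak_kb = committed_kb  # highest value ever observed

    def try_charge(self, size_kb):
        """Charge a new mapping; return False and undo if over the limit."""
        # Step 1: account for the mapping unconditionally.
        self.committed_kb += size_kb
        self.peak_kb = max(self.peak_kb, self.committed_kb)
        # Step 2: check against the limit (comparison is an assumption).
        if self.committed_kb <= self.limit_kb:
            return True
        # Step 3: rejected, so take the charge back again.
        self.committed_kb -= size_kb
        return False


if __name__ == "__main__":
    acct = CommitAccounting(limit_kb=1000, committed_kb=990)
    print(acct.try_charge(20))   # request that does not fit
    print(acct.committed_kb)     # back to the value before the attempt
    print(acct.peak_kb)          # the temporary overshoot was visible
```

Anything that samples the counter between step 1 and step 3 would see a
value above the limit, even though the mapping is never created.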
--
Thanks,
David / dhildenb