Repeated lvcreate&lvremove quickly increases page cache size

Hi all,

When I repeatedly run `lvm lvcreate` and `lvm lvremove`, I have noticed that
the page cache size gradually increases. For example, when I ran the following
commands:

```
while :; do
  sudo lvm lvcreate -n 7ee0d142-bffe-459f-87cb-56a355d24d2c -L 104857600b -W y -y ubuntu-vg;
  sudo lvm lvremove -f ubuntu-vg/7ee0d142-bffe-459f-87cb-56a355d24d2c;
  sleep 0.1;
done
```

the buff/cache value reported by free(1) increased by about 50 KiB/s on average.
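
For reference, a rough way to watch this growth is to sample /proc/meminfo in a
loop along these lines (just a sketch; it sums Buffers and Cached, which is
approximately what free(1) shows as buff/cache, minus reclaimable slab):

```
# Print Buffers + Cached from /proc/meminfo once per second.
while :; do
  printf '%s ' "$(date +%T)"
  awk '/^(Buffers|Cached):/ {sum += $2} END {print sum " KiB"}' /proc/meminfo
  sleep 1
done
```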

This would not be a problem if I were running LVM directly on the host.  In my
case, however, I run the same commands from a Docker container whose RAM is
limited to a relatively small amount by cgroup, and the container gets OOM
killed some time after startup, presumably because of the growing page cache.
This behavior can be reproduced as follows:

```
# Assume a VG named "ubuntu-vg" is already set up on the host.
docker run --privileged --pid=host --memory=15m -it ubuntu:22.04 bash -c '
while :; do
  /usr/bin/nsenter -m -u -i -n -p -t 1 /sbin/lvm lvcreate -n 7ee0d142-bffe-459f-87cb-56a355d24d2c -L 104857600b -W y -y ubuntu-vg
  /usr/bin/nsenter -m -u -i -n -p -t 1 /sbin/lvm lvremove -f ubuntu-vg/7ee0d142-bffe-459f-87cb-56a355d24d2c
  sleep 0.1
done'

# After a while, one of the lvm invocations gets OOM killed, e.g.:
#   bash: line 6: 4099016 Killed                  /usr/bin/nsenter -m -u -i -n -p -t 1 /sbin/lvm lvremove -f ubuntu-vg/7ee0d142-bffe-459f-87cb-56a355d24d2c
```
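
To check whether the cache is actually being charged to the container's memory
cgroup, the container's memory.stat can be inspected. A sketch (the exact path
depends on the cgroup version and the cgroup driver in use; both common layouts
are tried below):

```
# Full ID of the running ubuntu:22.04 container (assumes it is the only one).
CID=$(docker ps -q --no-trunc --filter ancestor=ubuntu:22.04 | head -n1)

# cgroup v2 with the systemd driver, and cgroup v1; one of the two should exist.
grep -E '^(file|cache|total_cache) ' \
  "/sys/fs/cgroup/system.slice/docker-${CID}.scope/memory.stat" \
  "/sys/fs/cgroup/memory/docker/${CID}/memory.stat" 2>/dev/null
```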

The same thing happens in a Kubernetes container.  For some unknown reason, the
page cache grows even faster on Kubernetes than on the host or under Docker, and
it only ever grows, never shrinks.  This can be reproduced using Minikube with
the KVM2 driver as follows:

```
# Set up Minikube and LVM
minikube start --driver=kvm2
minikube ssh -- sudo truncate --size=10G backing_store
minikube ssh -- sudo losetup -f backing_store
minikube ssh -- sudo vgcreate vg1 $(minikube ssh -n minikube -- sudo losetup -j backing_store | cut -d':' -f1)

# Create a pod that repeatedly runs lvcreate and lvremove
cat <<EOS | minikube kubectl -- apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  hostPID: true
  containers:
  - command:
    - bash
    - -cex
    - |
      # Run lvcreate and lvremove repeatedly
      while :; do
          /usr/bin/nsenter -m -u -i -n -p -t 1 /sbin/lvm lvcreate -n 7ee0d142-bffe-459f-87cb-56a355d24d2c -L 104857600b -W y -y vg1
          /usr/bin/nsenter -m -u -i -n -p -t 1 /sbin/lvm lvremove -f vg1/7ee0d142-bffe-459f-87cb-56a355d24d2c
          sleep 0.1
      done
    image: ubuntu:22.04
    name: lvm
    securityContext:
      privileged: true

    # Limit memory usage to 15Mi
    resources:
      limits:
        memory: 15Mi
EOS

# Watch the pod status. It will keep getting OOM killed.
minikube kubectl -- get pod test-pod -w

# Check the memory usage statistics exposed by cgroup. The field `total_cache` should be huge.
minikube ssh -- cat /sys/fs/cgroup/memory/kubepods/burstable/pod$(minikube kubectl -- get pod test-pod -o jsonpath='{.metadata.uid}')/memory.stat

# Clean up
minikube delete --all --purge
```
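
To see the growth over time rather than a single snapshot, the same memory.stat
file can be sampled in a loop (again a sketch, assuming the cgroup v1 layout
used in the command above):

```
# Print the pod's total_cache counter once per second.
POD_UID=$(minikube kubectl -- get pod test-pod -o jsonpath='{.metadata.uid}')
while :; do
  minikube ssh -- "grep ^total_cache /sys/fs/cgroup/memory/kubepods/burstable/pod${POD_UID}/memory.stat"
  sleep 1
done
```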

Does anyone know how to work around this problem? 

My environment is as follows:

- OS: Ubuntu 22.04.4 LTS
  - `uname -a`: Linux _ 5.15.0-78-generic #85-Ubuntu SMP Fri Jul 7 15:25:09 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
- Docker version: 26.0.1
- minikube version: v1.32.0 (commit: 8220a6eb95f0a4d75f7f2d7b14cef975f050512d)

Thanks,
Ryotaro



