nfs client caching issues

Dear nfs developers,

We are running Gitea in Kubernetes as two pods, each on a different VM. The Gitea data directory is mounted from an NFS share, and we periodically hit a caching issue where one of the pods is out of sync with the directory entries:

# kubectl -n ci-cd exec -it gitea-546489b89b-c8j5j -- sh -c 'cd git/repositories/.../objects/pack && ls -la'
total 32627
drwxr-xr-x    2 git      git              5 Nov 16 08:59 .
drwxr-xr-x    4 git      git              4 Nov 16 08:59 ..
-r--r--r--    1 git      git          28634 Nov 16 08:59 pack-f3648f5e8e42d671dee4868d08fe24fa50b47fac.bitmap
-r--r--r--    1 git      git         134744 Nov 16 08:59 pack-f3648f5e8e42d671dee4868d08fe24fa50b47fac.idx
-r--r--r--    1 git      git       33191104 Nov 16 08:59 pack-f3648f5e8e42d671dee4868d08fe24fa50b47fac.pack

# kubectl -n ci-cd exec -it gitea-546489b89b-d7lcn -- sh -c 'cd git/repositories/.../objects/pack && ls -la'
ls: ./pack-249aee1788eeaca050d1c083e6598d675ba1017e.pack: No such file or directory
ls: ./pack-249aee1788eeaca050d1c083e6598d675ba1017e.bitmap: No such file or directory
ls: ./pack-249aee1788eeaca050d1c083e6598d675ba1017e.idx: No such file or directory
total 25
drwxr-xr-x    2 git      git              5 Nov 16 08:59 .
drwxr-xr-x    4 git      git              4 Nov 16 08:59 ..
command terminated with exit code 1

The second pod still has cached directory entries for files that no longer exist. However, if I stat one of the files that does exist, it succeeds on the failing pod:

# kubectl -n ci-cd exec -it gitea-546489b89b-d7lcn -- sh -c 'cd git/repositories/.../objects/pack && stat pack-f3648f5e8e42d671dee4868d08fe24fa50b47fac.bitmap'
  File: pack-f3648f5e8e42d671dee4868d08fe24fa50b47fac.bitmap
  Size: 28634     	Blocks: 41         IO Block: 131072 regular file
Device: c0h/192d	Inode: 98069       Links: 1
Access: (0444/-r--r--r--)  Uid: ( 1000/     git)   Gid: ( 1000/     git)
Access: 2023-11-16 08:59:38.176448632 +0000
Modify: 2023-11-16 08:59:38.177839293 +0000
Change: 2023-11-16 08:59:38.203249724 +0000

The directory metadata appears to be the same on both nodes:
# kubectl -n ci-cd exec -it gitea-546489b89b-c8j5j -- sh -c 'cd git/repositories/.../objects/pack && stat .'
  File: .
  Size: 5         	Blocks: 49         IO Block: 131072 directory
Device: 87h/135d	Inode: 15013       Links: 2
Access: (0755/drwxr-xr-x)  Uid: ( 1000/     git)   Gid: ( 1000/     git)
Access: 2023-11-16 09:06:02.594066258 +0000
Modify: 2023-11-16 08:59:38.232154965 +0000
Change: 2023-11-16 08:59:38.232154965 +0000

# kubectl -n ci-cd exec -it gitea-546489b89b-d7lcn -- sh -c 'cd git/repositories/.../objects/pack && stat .'
  File: .
  Size: 5         	Blocks: 49         IO Block: 131072 directory
Device: c0h/192d	Inode: 15013       Links: 2
Access: (0755/drwxr-xr-x)  Uid: ( 1000/     git)   Gid: ( 1000/     git)
Access: 2023-11-16 09:06:02.594066258 +0000
Modify: 2023-11-16 08:59:38.232154965 +0000
Change: 2023-11-16 08:59:38.232154965 +0000

Issuing ls on the failing pod generates a GETATTR for the directory. I assume the client receives the same metadata it already has cached, concludes that nothing has changed, and then tries to stat() the three cached entries, which fails.

Touching the affected directory, even from the other node, is enough to make the failing node recover and get back in sync.
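
A minimal sketch of how the stale state could be detected and cleared from a script, assuming a POSIX shell is available in the pod. The function names dir_is_consistent and refresh_dir are hypothetical, and the touch only mirrors the manual workaround described above; it is not a validated fix.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Gitea or any NFS tooling.

# Returns 0 if every name listed in the directory can be stat()ed,
# 1 if the listing contains names that no longer resolve (the state
# shown on the failing pod), 2 if the argument is not a directory.
dir_is_consistent() {
    dir=$1
    [ -d "$dir" ] || return 2
    # ls -la stats each listed entry; stale names show up on stderr.
    # LC_ALL=C keeps the error message in English so grep can match it.
    if LC_ALL=C ls -la "$dir" 2>&1 >/dev/null | grep -q 'No such file or directory'; then
        return 1
    fi
    return 0
}

# Applies the touch workaround only when the listing looks stale,
# then reports whether the directory is consistent afterwards.
refresh_dir() {
    dir=$1
    dir_is_consistent "$dir" && return 0
    touch "$dir" || return 2
    dir_is_consistent "$dir"
}
```

For example, `refresh_dir git/repositories/.../objects/pack` would return 0 either when the listing was already consistent or when the touch brought it back in sync.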

Both nodes are running the Debian stable kernel:
# uname -a
Linux node 6.1.0-13-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.55-1 (2023-09-29) x86_64 GNU/Linux

The nfs server is a TrueNAS server (FreeBSD).

We have default mount options:

# mount | grep nfs4
x.x.x.x:/mnt/main/e-sz-k8s/csi/rgbcpj4nw9tywroxrqw8bpw1zmzyu5mg on /var/lib/kubelet/pods/69d3e536-d0a9-4c02-9461-8f14d058ea60/volumes/kubernetes.io~csi/pvc-c76ad5da-9c1b-4692-b768-5469a9517d57/mount type nfs4 (rw,relatime,vers=4.2,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=c.c.c.c,local_lock=none,addr=x.x.x.x)
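
To make "default" concrete, here is a small sketch that filters an option string like the one above for the cache-related options documented in nfs(5) (ac/noac, actimeo, acregmin, acregmax, acdirmin, acdirmax, lookupcache, cto/nocto). The function name cache_opts is hypothetical. For the option string shown above it prints nothing, which is consistent with no cache tuning being set explicitly.

```shell
#!/bin/sh
# Hypothetical helper: print the cache-related NFS mount options that are
# explicitly present in a comma-separated option string, one per line.
cache_opts() {
    printf '%s\n' "$1" \
        | tr ',' '\n' \
        | grep -E '^(ac|noac|cto|nocto|actimeo=.*|acregmin=.*|acregmax=.*|acdirmin=.*|acdirmax=.*|lookupcache=.*)$' \
        || true   # no match means no explicit cache options, not an error
}
```

Usage: `cache_opts "rw,relatime,vers=4.2,hard,proto=tcp"` prints nothing, while an option string containing `lookupcache=none` would print that option.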

Is this a cache configuration issue, or a bug in the Linux NFS client or the FreeBSD NFS server code?

Thanks in advance,
Richard


