Fwd: NFS Caching broken in 4.19.37

Forwarding to maintainers (apologies, did not cc on first send).

A.

-------- Forwarded Message --------
Subject: NFS Caching broken in 4.19.37
Date: Mon, 8 Jul 2019 19:19:54 +0100
From: Anton Ivanov <anton.ivanov@xxxxxxxxxxxxxxxxxx>
Organization: Cambridge Greys
To: Linux Kernel Mailing List <linux-kernel@xxxxxxxxxxxxxxx>

Hi list,

NFS caching appears broken in 4.19.37.

The more cores/threads, the easier it is to reproduce. Tested with identical results on Ryzen 1600 and 1600X.

1. Mount an openwrt build tree over NFS v4
2. Run make -j `cat /proc/cpuinfo | grep vendor | wc -l` ; make clean in a loop
3. Result after 3-4 iterations:

State on the client

ls -laF /var/autofs/local/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm

total 8
drwxr-xr-x 2 anivanov anivanov 4096 Jul  8 11:40 ./
drwxr-xr-x 3 anivanov anivanov 4096 Jul  8 11:40 ../

State as seen on the server (mounted via NFS from localhost):

ls -laF /var/autofs/local/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm
total 12
drwxr-xr-x 2 anivanov anivanov 4096 Jul  8 11:40 ./
drwxr-xr-x 3 anivanov anivanov 4096 Jul  8 11:40 ../
-rw-r--r-- 1 anivanov anivanov   32 Jul  8 11:40 ipcbuf.h

Actual state on the filesystem:

ls -laF /exports/work/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm
total 12
drwxr-xr-x 2 anivanov anivanov 4096 Jul  8 11:40 ./
drwxr-xr-x 3 anivanov anivanov 4096 Jul  8 11:40 ../
-rw-r--r-- 1 anivanov anivanov   32 Jul  8 11:40 ipcbuf.h

So the client has quite clearly lost the plot. Telling it to drop caches and re-reading the directory shows the file present.
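The comparison between the client view and the exported filesystem can be automated. The following is only a sketch: missing_on_client is a hypothetical helper name, not something from the build tree or from nfs-utils. It lists entries that the reference directory has but the client's view of the same directory lacks.

```shell
#!/bin/sh
# Hypothetical helper: print entries present in the reference directory ($2)
# but absent from the client's view of the same directory ($1).
missing_on_client() {
    client_dir=$1
    reference_dir=$2
    tmp_client=$(mktemp) || return 1
    tmp_ref=$(mktemp) || { rm -f "$tmp_client"; return 1; }
    # comm needs both inputs sorted in the same collation order.
    ls -A "$client_dir" | LC_ALL=C sort > "$tmp_client"
    ls -A "$reference_dir" | LC_ALL=C sort > "$tmp_ref"
    # -13 suppresses lines unique to the first file and lines common to both,
    # leaving only names that exist in the reference directory alone.
    LC_ALL=C comm -13 "$tmp_client" "$tmp_ref"
    rm -f "$tmp_client" "$tmp_ref"
}
```

On the machine that has both views, it could be called with the client-side path under /var/autofs/local/src/openwrt/ as the first argument and the matching path under /exports/work/src/openwrt/ as the second. The message does not say how the caches were dropped; one common way, assumed here, is running "echo 2 > /proc/sys/vm/drop_caches" as root, which frees cached dentries and inodes, and then running the helper again.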

It is possible to reproduce this using a Linux kernel tree too; it just takes many more iterations, at least 10.
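The build/clean loop from the reproduction steps can be sketched as a small shell function. The name repro_loop and the JOBS and MAKE override variables are assumptions made for this illustration; nproc is used here in place of the /proc/cpuinfo pipeline from the steps above.

```shell
#!/bin/sh
# Hypothetical helper: run "make -j N ; make clean" in TREE for COUNT iterations.
# JOBS overrides the job count (default: nproc); MAKE overrides the make binary.
repro_loop() {
    tree=$1
    count=$2
    njobs=${JOBS:-$(nproc)}
    i=1
    while [ "$i" -le "$count" ]; do
        echo "iteration $i"
        # The exit status of each step is ignored so the loop keeps going
        # even if a build or clean step fails.
        (cd "$tree" && ${MAKE:-make} -j "$njobs") || true
        (cd "$tree" && ${MAKE:-make} clean) || true
        i=$((i + 1))
    done
}
```

For the openwrt tree, something like "repro_loop /var/autofs/local/src/openwrt 4" would cover the 3-4 iterations mentioned above; for a kernel tree the count would need to be 10 or more.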

Both client and server run 4.19.37 from Debian buster. This is filed as Debian bug 931500. I originally thought it was autofs-related, but IMHO it is actually something fundamentally broken in NFS caching, resulting in cache corruption.

--
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661
https://www.cambridgegreys.com/


