I'm assuming that the revalidation is failing in cifs_revalidate_dentry, because it returns -ERESTARTSYS. Going by the documentation of d_revalidate: > This function should return a positive value if the dentry is still > valid, and zero or a negative error code if it isn't. In case of error, can we try returning the rc itself (rather than 0), and see if VFS avoids a dentry put? Because theoretically, the call execution has failed, and the dentry is not found to be invalid. Regards, Shyam On Fri, Jan 29, 2021 at 9:16 AM Aurélien Aptel <aaptel@xxxxxxxx> wrote: > > From: Aurelien Aptel <aaptel@xxxxxxxx> > > Assuming > - //HOST/a is mounted on /mnt > - //HOST/b is mounted on /mnt/b > > On a slow connection, running 'df' and killing it while it's > processing /mnt/b can make cifs_get_inode_info() returns -ERESTARTSYS. > > This triggers the following chain of events: > => the dentry revalidation fail > => dentry is put and released > => superblock associated with the dentry is put > => /mnt/b is unmounted > > This quick fix makes cifs_d_revalidate() always succeed (mark as > valid) on cifs dentries which are also mount points. > > Signed-off-by: Aurelien Aptel <aaptel@xxxxxxxx> > Signed-off-by: Paulo Alcantara (SUSE) <pc@xxxxxx> > --- > > I have managed to reproduce this bug with the following script. It > uses tc with netem discipline (CONFIG_NET_SCH_NETEM=y) to simulate > network delays. > > #!/bin/bash > > # > # reproducing bsc#1177440 > # > # nested mount point gets unmounted when process is signaled > # > set -e > > share1=//192.168.2.110/scratch > share2=//192.168.2.110/test > opts="username=administrator,password=aaptel-42,vers=1.0,actimeo=0" > > cleanup() { > # reset delay > tc qdisc del dev eth0 root > mount|grep -q /mnt/nest/a && umount /mnt/nest/a > mount|grep -q /mnt/nest && umount /mnt/nest > > echo 'module cifs -p' > /sys/kernel/debug/dynamic_debug/control > echo 'file fs/cifs/* -p' > /sys/kernel/debug/dynamic_debug/control > echo 0 > /proc/fs/cifs/cifsFYI > echo 0 > /sys/module/dns_resolver/parameters/debug > echo 1 > /proc/sys/kernel/printk_ratelimit > > } > > trap cleanup EXIT > > nbcifs() { > mount|grep cifs|wc -l > } > > reset() { > echo "unmounting and reloading cifs.ko" > mount|grep -q /mnt/nest/a && umount /mnt/nest/a > mount|grep -q /mnt/nest && umount /mnt/nest > sleep 1 > lsmod|grep -q cifs && ( modprobe -r cifs &> /dev/null || true ) > lsmod|grep -q cifs || ( modprobe cifs &> /dev/null || true ) > } > > mnt() { > dmesg --clear > echo 'module cifs +p' > /sys/kernel/debug/dynamic_debug/control > echo 'file fs/cifs/* +p' > /sys/kernel/debug/dynamic_debug/control > echo 1 > /proc/fs/cifs/cifsFYI > echo 1 > /sys/module/dns_resolver/parameters/debug > echo 0 > /proc/sys/kernel/printk_ratelimit > > echo "mounting" > mkdir -p /mnt/nest > mount.cifs $share1 /mnt/nest -o "$opts" > mkdir -p /mnt/nest/a > mount.cifs $share2 /mnt/nest/a -o "$opts" > } > > # add fake delay > tc qdisc add dev eth0 root netem delay 300ms > > while :; do > reset > mnt > n=$(nbcifs) > echo "starting df with $n mounts" > df & > pid=$! > sleep 1.5 > kill $pid || true > x=$(nbcifs) > echo "stopping with $x mounts" > if [ $x -lt $n ]; then > echo "lost mounts" > dmesg > kernel.log > exit 1 > fi > done > > > > fs/cifs/dir.c | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/fs/cifs/dir.c b/fs/cifs/dir.c > index 68900f1629bff..876ef01628538 100644 > --- a/fs/cifs/dir.c > +++ b/fs/cifs/dir.c > @@ -741,6 +741,10 @@ cifs_d_revalidate(struct dentry *direntry, unsigned int flags) > if (flags & LOOKUP_RCU) > return -ECHILD; > > + /* nested cifs mount point are always valid */ > + if (d_mountpoint(direntry)) > + return 1; > + > if (d_really_is_positive(direntry)) { > inode = d_inode(direntry); > if ((flags & LOOKUP_REVAL) && !CIFS_CACHE_READ(CIFS_I(inode))) > -- > 2.29.2 > -- Regards, Shyam