On Mon, Nov 16 2020, Trond Myklebust wrote: > On Mon, 2020-11-16 at 16:00 +1100, NeilBrown wrote: >> On Mon, Nov 16 2020, Trond Myklebust wrote: >> >> > On Mon, 2020-11-16 at 15:43 +1100, NeilBrown wrote: >> > > On Mon, Nov 16 2020, Trond Myklebust wrote: >> > > >> > > > On Mon, 2020-11-16 at 13:59 +1100, NeilBrown wrote: >> > > > > >> > > > > Prior to commit 5ceb9d7fdaaf ("NFS: Refactor >> > > > > nfs_lookup_revalidate()") >> > > > > and error from nfs_lookup_verify_inode() other than -ESTALE >> > > > > would >> > > > > result >> > > > > in nfs_lookup_revalidate() returning that error code (-ESTALE >> > > > > is >> > > > > mapped >> > > > > to zero). >> > > > > Since that commit, all errors result in zero being returned. >> > > > > >> > > > > When nfs_lookup_revalidate() returns zero, the dentry is >> > > > > invalidated >> > > > > and, significantly, if the dentry is a directory that is >> > > > > mounted >> > > > > on, >> > > > > that mountpoint is lost. >> > > > > >> > > > > If you: >> > > > > - mount an NFS filesystem which contains a directory >> > > > > - mount something (e.g. tmpfs) on that directory >> > > > > - use iptables (or scissors) to block traffic to the server >> > > > > - ls -l the-mounted-on-directory >> > > > > - interrupt the 'ls -l' >> > > > > you will find that the directory has been unmounted. >> > > > > >> > > > > This can be fixed by returning the actual error code from >> > > > > nfs_lookup_verify_inode() rather then zero (except for - >> > > > > ESTALE). >> > > > > >> > > > > Fixes: 5ceb9d7fdaaf ("NFS: Refactor nfs_lookup_revalidate()") >> > > > > Signed-off-by: NeilBrown <neilb@xxxxxxx> >> > > > > --- >> > > > > fs/nfs/dir.c | 8 +++++--- >> > > > > 1 file changed, 5 insertions(+), 3 deletions(-) >> > > > > >> > > > > diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c >> > > > > index cb52db9a0cfb..d24acf556e9e 100644 >> > > > > --- a/fs/nfs/dir.c >> > > > > +++ b/fs/nfs/dir.c >> > > > > @@ -1350,7 +1350,7 @@ nfs_do_lookup_revalidate(struct inode >> > > > > *dir, >> > > > > struct dentry *dentry, >> > > > > unsigned int flags) >> > > > > { >> > > > > struct inode *inode; >> > > > > - int error; >> > > > > + int error = 0; >> > > > > >> > > > > nfs_inc_stats(dir, NFSIOS_DENTRYREVALIDATE); >> > > > > inode = d_inode(dentry); >> > > > > @@ -1372,8 +1372,10 @@ nfs_do_lookup_revalidate(struct inode >> > > > > *dir, >> > > > > struct dentry *dentry, >> > > > > nfs_check_verifier(dir, dentry, flags & >> > > > > LOOKUP_RCU)) >> > > > > { >> > > > > error = nfs_lookup_verify_inode(inode, >> > > > > flags); >> > > > > if (error) { >> > > > > - if (error == -ESTALE) >> > > > > + if (error == -ESTALE) { >> > > > > nfs_zap_caches(dir); >> > > > > + error = 0; >> > > > > + } >> > > > > goto out_bad; >> > > > > } >> > > > > nfs_advise_use_readdirplus(dir); >> > > > > @@ -1395,7 +1397,7 @@ nfs_do_lookup_revalidate(struct inode >> > > > > *dir, >> > > > > struct dentry *dentry, >> > > > > out_bad: >> > > > > if (flags & LOOKUP_RCU) >> > > > > return -ECHILD; >> > > > > - return nfs_lookup_revalidate_done(dir, dentry, inode, >> > > > > 0); >> > > > > + return nfs_lookup_revalidate_done(dir, dentry, inode, >> > > > > error); >> > > > >> > > > Which errors do we actually need to return here? As far as I >> > > > can >> > > > tell, >> > > > the only errors that nfs_lookup_verify_inode() is supposed to >> > > > return is >> > > > ENOMEM, ESTALE, ECHILD, and possibly EIO or ETiMEDOUT. >> > > > >> > > > Why would it be better to return those errors rather than just >> > > > a 0 >> > > > when >> > > > we need to invalidate the inode, particularly since we already >> > > > have >> > > > a >> > > > special case in nfs_lookup_revalidate_done() when the dentry is >> > > > root? >> > > >> > > ERESTARTSYS is the error that easily causes problems. >> > > >> > > Returning 0 causes d_invalidate() to be called which is quite >> > > heavy >> > > handed in mountpoints. >> > >> > My point is that it shouldn't get returned for mountpoints. See >> > nfs_lookup_revalidate_done(). >> >> nfs_lookup_revalidate_done() only checks IS_ROOT(), and while many >> mountpoints are IS_ROOT(), not all are (--bind easily makes others). >> >> But that isn't even really relevant here. The dentry being >> revalidated >> is the underlying directory - that something else is mounted on. >> step_into() which follows mount points is called in walk_component() >> *after* lookup_fast or lookup_slow which will have revalidated the >> dentry. > > So then why is it not sufficient to just add a check for > d_mountpoint()? This is a revalidation, not a new lookup. > I guess you could do that. But why would you want to call d_invalidate() just because a signal was received, or a memory allocation failed? NeilBrown >> >> NeilBrown >> >> >> > >> > > So it is only reasonable to return 0 when we have unambiguous >> > > confirmation from the server that the object no longer exists. >> > > ESTALE >> > > is unambiguous. EIO might be unambiguous. ERESTARTSYS, ENOMEM, >> > > ETIMEDOUT are transient and don't justify d_invalidate() being >> > > called. >> > > >> > > (BTW, Commit cc89684c9a26 ("NFS: only invalidate dentrys that are >> > > clearly invalid.") >> > > fixed much the same bug 3 years ago). >> > > >> > > Thanks, >> > > NeilBrown >> > > >> > > >> > > > >> > > > > } >> > > > > >> > > > > static int >> > > > >> > > > -- >> > > > Trond Myklebust >> > > > Linux NFS client maintainer, Hammerspace >> > > > trond.myklebust@xxxxxxxxxxxxxxx >> > >> > -- >> > Trond Myklebust >> > CTO, Hammerspace Inc >> > 4984 El Camino Real, Suite 208 >> > Los Altos, CA 94022 >> > >> > www.hammer.space > > -- > Trond Myklebust > Linux NFS client maintainer, Hammerspace > trond.myklebust@xxxxxxxxxxxxxxx
Attachment:
signature.asc
Description: PGP signature