On Tue, 2018-11-06 at 10:49 +1100, NeilBrown wrote: > On Mon, Nov 05 2018, Trond Myklebust wrote: > > > On Mon, 2018-11-05 at 13:10 +0530, Malahal Naineni wrote: > > > > > My reading of section 10.3.4 of RFC7530 suggests that the > > > > > client > > > > > should > > > generally compare fsid and fileid to see if two different > > > filehandles > > > refer to > > > the same object or not. > > > > Except that is wrong. As I already said in my previous email, there > > are > > servers out there in the field that are happy to serve up snapshots > > that have the exact same fsid and fileid as the original files. > > NetApp > > will for instance happily do this by default unless you explicitly > > configure it not to. > > > > > Section 10.3.4 is for files only correct? The issue here is for > > > directories. Also, Trond clearly pointed that Linux breaks > > > section > > > 10.3.4 from his email stating "We treat always different > > > filehandles > > > as if they refer to different > > > files. It has long been the case that snapshots from several > > > vendors > > > are encoded to look like the same file (same fileid + same fsid) > > > and > > > differing only by filehandle. If we were to try to consolidate > > > those > > > inodes we would end up corrupting application data." > > > > > > We don't respect either NFSv3 or NFSv4 RFCs in this regard! > > > > While RFC7530 does have section 10.3.4 that describes "a reliable > > method to determine whether two distinct filehandles represent > > distinct > > objects", as long as server vendors are shipping product that > > violates > > it, then that entire section is a moot point. > > Ignoring the spec in order to support broken servers wouldn't be my > first choice, but you do have a point. > However, this is only an issue (as far as I know) in a specific > circumstance that would not (I think) affect those servers. It's about not breaking existing setups that have been working like this for > 20 years. > If we do a lookup of a name that we already have in the dcache, and > we > get a filehandle which is different from the cached inode, but has > the > same fsid/fileid as the cached inode, then it isn't going to be the > same > file in a different snapshot. In that case it might be reasonable to > treat it as the same file, at least when it is a directory. > i.e. same ( fsid, fileid, type, name) means same object. That's not the case for NetApp afaik. It can mean "snapshot of same object". The reason why this is allowed, and often preferred by sysadmins is that doing an "under the cover mount" in order to satisfy a fsid change is generally not scalable if you have several .snapshot entries per directory. > Maybe that would be too messy to implement, but it seems to be a > possible balance between compliance and safety. It should stop > directories from becoming "(deleted)" but shouldn't risk data > corruption. > > > Thanks, > NeilBrown > > > > BTW: Note also how the same section reminds server vendors that > > "For > > NFSv3 clients, the typical practice has been to assume for the > > purpose > > of caching that distinct filehandles represent distinct file system > > objects." > > > > However, even if a client were to follow Section 10.3.4, then > > Section > > 9.1.4 states that any open/lock/delegation stateid is associated > > with a > > _single filehandle_, and that the lock state it carries is not > > allowed > > to be consolidated per file or fileid. See also Section 9.11, which > > more explicitly describes how to treat the multiple filehandle > > case. > > > > So while NFSv4 theoretically allows for the behaviour you are > > asking > > for, it is not particularly practical to implement, and as I said, > > the > > entire Section 10.3.4 is undermined by existing server > > implementations. > > > > > Regards, Malahal. > > > > > > > > > Regards, Malahal. > > > On Mon, Nov 5, 2018 at 10:39 AM NeilBrown <neilb@xxxxxxxx> wrote: > > > > On Mon, Nov 05 2018, Malahal Naineni wrote: > > > > > > > > > > Do we know exactly why the FH changed in this particular > > > > > > circumstance? > > > > > > > > > > In this instance, this is due to a code bug but obviously, > > > > > there > > > > > are > > > > > legitimate cases where this occur with Ganesha. > > > > > > > > Good to know that bug has been found, and presumably fixed. > > > > It is not obvious to me that there are any such legitimate > > > > cases > > > > for > > > > directories. > > > > > > > > > > (I'm particularly thinking of volatile file handles). > > > > > > > > > > NFS4 RFC has "unique filehandles" concept as well. Linux NFS > > > > > client > > > > > doesn't seem to use "unique filehandles" attribute as well. > > > > > > > > A client doesn't need to use that attribute. > > > > My reading of section 10.3.4 of RFC7530 suggests that the > > > > client > > > > should > > > > generally compare fsid and fileid to see if two different > > > > filehandles refer to > > > > the same object or not. > > > > If unique_handles is known to be set for a given fsid, then > > > > different > > > > filehandles imply different files, without bothering to check > > > > the > > > > fileid. > > > > So the use of unique_handles is an optimization. > > > > > > > > I haven't looked at the Linux/NFS code to see if it conforms to > > > > 10.3.4. > > > > > > > > NeilBrown > > > > > > > > > > > > > On Mon, Nov 5, 2018 at 6:02 AM NeilBrown <neilb@xxxxxxxx> > > > > > wrote: > > > > > > On Sun, Nov 04 2018, Marc Eshel wrote: > > > > > > > > > > > > > linux-nfs-owner@xxxxxxxxxxxxxxx wrote on 11/03/2018 > > > > > > > 10:31:29 > > > > > > > PM: > > > > > > > > > > > > > > > From: NeilBrown <neilb@xxxxxxxx> > > > > > > > > To: Marc Eshel <eshel@xxxxxxxxxx>, Trond Myklebust > > > > > > > <trondmy@xxxxxxxxxxxxxxx> > > > > > > > > Cc: "bcodding\@redhat.com" <bcodding@xxxxxxxxxx>, > > > > > > > > "linux- > > > > > > > > nfs > > > > > > > > \@vger.kernel.org" <linux-nfs@xxxxxxxxxxxxxxx>, linux- > > > > > > > > nfs- > > > > > > > > owner@xxxxxxxxxxxxxxx, "malahal\@gmail.com" < > > > > > > > > malahal@xxxxxxxxx>, > > > > > > > > "mbenjami\@redhat.com" <mbenjami@xxxxxxxxxx> > > > > > > > > Date: 11/03/2018 10:41 PM > > > > > > > > Subject: Re: "(deleted)" directories > > > > > > > > Sent by: linux-nfs-owner@xxxxxxxxxxxxxxx > > > > > > > > > > > > > > > > On Fri, Nov 02 2018, Marc Eshel wrote: > > > > > > > > > > > > > > > > > One reason to have different FHs for the same file is > > > > > > > > > that a file can > > > > > > > be > > > > > > > > > linked from multiple directories. > > > > > > > > > > > > > > > > This has some based when considering filehandles for > > > > > > > > non- > > > > > > > > directories. > > > > > > > > However the original problem was with filehandles for > > > > > > > > directories..... > > > > > > > > > > > > > > This was just an example of why FH might be different, I > > > > > > > don't think we > > > > > > > depend on it for the parent information anymore. Malahal > > > > > > > listed some other > > > > > > > reasons for having different FH for the same file. I > > > > > > > believe > > > > > > > that Ganesha > > > > > > > split the FH to the key portion (the unique id of the > > > > > > > file) > > > > > > > and some other > > > > > > > information that is file system dependent. If the NFS > > > > > > > client > > > > > > > can not > > > > > > > handle the spec definition of FH maybe the spec should be > > > > > > > updated to > > > > > > > something like Ganesha does. > > > > > > > Marc. > > > > > > > > > > > > Do we know exactly why the FH changed in this particular > > > > > > circumstance? > > > > > > Is there some way to find out? > > > > > > > > > > > > The NFSv3 spec has been updated - it is called "NFSv4" (now > > > > > > 4.2). It > > > > > > says a lot more things about filehandles, but even there, > > > > > > the > > > > > > spec is > > > > > > only as good as the what has been implemented and > > > > > > tested. I'm > > > > > > pretty > > > > > > sure that there are parts of the FH spec that have never > > > > > > been > > > > > > put into > > > > > > practice - so using them would not be wise (I'm > > > > > > particularly > > > > > > thinking of > > > > > > volatile file handles). > > > > > > > > > > > > For better or worse, Linux requires directories to have > > > > > > stable > > > > > > filehandles for NFSv3. This requirement is effectively > > > > > > imposed > > > > > > by the > > > > > > dcache. If there were some way to reliably check if two > > > > > > filehandles > > > > > > referred to the same directory, then we could relax that > > > > > > restriction, > > > > > > but I don't think there is. > > > > > > > > > > > > I think the other possible reason mentioned for changing > > > > > > the > > > > > > filehandle > > > > > > is to support migration. NFSv3 definitely doesn't support > > > > > > migration. > > > > > > NFSv4 explicitly tries to. > > > > > > > > > > > > NeilBrown > > > > > > > > > > > > > > > > > > > > > Adding the parent inode to the FH help finding the > > > > > > > > > the > > > > > > > > > name of the > > > > > > > file by > > > > > > > > > looking for the file inode in > > > > > > > > > the parent directoy. > > > > > > > > > > > > > > > > > > > > > > > > > ....and directories have a ".." link, obviating the > > > > > > > > need to > > > > > > > > store parent > > > > > > > > information in the filehandle. > > > > > > > > > > > > > > > > NeilBrown > > > > > > > > > > > > > > > > > > > > > > > > > Marc. > > > > > > > > > > > > > > > > > > linux-nfs-owner@xxxxxxxxxxxxxxx wrote on 11/02/2018 > > > > > > > > > 05:15:42 PM: > > > > > > > > > > > > > > > > > > > From: Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> > > > > > > > > > > To: "mbenjami@xxxxxxxxxx" <mbenjami@xxxxxxxxxx> > > > > > > > > > > Cc: "bcodding@xxxxxxxxxx" <bcodding@xxxxxxxxxx>, " > > > > > > > > > > malahal@xxxxxxxxx" > > > > > > > > > > <malahal@xxxxxxxxx>, "linux-nfs@xxxxxxxxxxxxxxx" > > > > > > > > > <linux-nfs@xxxxxxxxxxxxxxx> > > > > > > > > > > Date: 11/02/2018 05:15 PM > > > > > > > > > > Subject: Re: "(deleted)" directories > > > > > > > > > > Sent by: linux-nfs-owner@xxxxxxxxxxxxxxx > > > > > > > > > > > > > > > > > > > > On Fri, 2018-11-02 at 18:07 -0400, Matt Benjamin > > > > > > > > > > wrote: > > > > > > > > > > > It sounds like a pretty good one, that goes to > > > > > > > > > > > the > > > > > > > > > > > heart of what a > > > > > > > > > > > specification is > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > While admittedly it is (still) Dia de los Muertos > > > > > > > > > > today, I would > > > > > > > think > > > > > > > > > > that someone who resurrected a part of the NFSv3 > > > > > > > > > > spec > > > > > > > > > > that has been > > > > > > > > > > unused for the full 23 years of its existence might > > > > > > > > > > have some > > > > > > > > > > explanation for why they did so? > > > > > > > > > > > > > > > > > > > > IOW: not being of a particularly religious > > > > > > > > > > persuasion, > > > > > > > > > > I usually want > > > > > > > > > > to understand why features are needed rather than > > > > > > > > > > having blind faith > > > > > > > in > > > > > > > > > > the person who wrote the spec. > > > > > > > > > > > > > > > > > > > > > Matt > > > > > > > > > > > > > > > > > > > > > > On Fri, Nov 2, 2018 at 4:26 PM, Trond Myklebust < > > > > > > > > > > > trondmy@xxxxxxxxxxxxxxx> wrote: > > > > > > > > > > > > On Fri, 2018-11-02 at 21:24 +0530, Malahal > > > > > > > > > > > > Naineni > > > > > > > > > > > > wrote: > > > > > > > > > > > > > Ben, NFSv3 RFC1813.txt states: "If two file > > > > > > > > > > > > > handles from the > > > > > > > same > > > > > > > > > > > > > server are equal, they must refer to the same > > > > > > > > > > > > > file, but if > > > > > > > > > > > > > they are > > > > > > > > > > > > > not equal, no conclusions can be drawn." > > > > > > > > > > > > > Ganesha > > > > > > > > > > > > > does return > > > > > > > same > > > > > > > > > > > > > fileid here (inode). > > > > > > > > > > > > > > > > > > > > > > > > > > In NFSv4, they have introduced > > > > > > > > > > > > > "unique_handles" > > > > > > > > > > > > > attribute. I > > > > > > > > > > > > > don't > > > > > > > > > > > > > see > > > > > > > > > > > > > Linux NFS client using this at all though. > > > > > > > > > > > > > > > > > > > > > > > > Why does your server need to have multiple > > > > > > > > > > > > filehandles refer to > > > > > > > the > > > > > > > > > > > > same file, and why do you expect clients to > > > > > > > > > > > > support > > > > > > > > > > > > this? > > > > > > > > > > > > > > > > > > > > > > > > Yes, the spec allows it, but that's not a > > > > > > > > > > > > sufficient reason. > > > > > > > > > > > > > > > > > > > > > > > > > Regards, Malahal. > > > > > > > > > > > > > On Fri, Nov 2, 2018 at 4:35 PM Benjamin > > > > > > > > > > > > > Coddington < > > > > > > > > > > > > > bcodding@xxxxxxxxxx> wrote: > > > > > > > > > > > > > > On 2 Nov 2018, at 1:26, Malahal Naineni > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi All, we are using NFS-Ganesha with > > > > > > > > > > > > > > > Linux > > > > > > > > > > > > > > > NFS clients. > > > > > > > The > > > > > > > > > > > > > > > client's > > > > > > > > > > > > > > > shell reports the following. Based on > > > > > > > > > > > > > > > lsof, > > > > > > > > > > > > > > > the directory > > > > > > > is > > > > > > > > > > > > > > > marked > > > > > > > > > > > > > > > deleted. "cd to ROOT and cd to the same > > > > > > > > > > > > > > > home > > > > > > > > > > > > > > > directory > > > > > > > fixes > > > > > > > > > > > > > > > the > > > > > > > > > > > > > > > issue. The client behaves as though the > > > > > > > > > > > > > > > directory is > > > > > > > deleted > > > > > > > > > > > > > > > and > > > > > > > > > > > > > > > recreated! Our NFS-Ganesha server > > > > > > > > > > > > > > > implementation uses > > > > > > > > > > > > > > > multiple > > > > > > > > > > > > > > > file > > > > > > > > > > > > > > > handles that point to the same object. > > > > > > > > > > > > > > > NFS > > > > > > > > > > > > > > > spec says this > > > > > > > > > > > > > > > should > > > > > > > > > > > > > > > be > > > > > > > > > > > > > > > fine, but Linux NFS seems to be broken in > > > > > > > > > > > > > > > this regard. > > > > > > > > > > > > > > > tcpdump > > > > > > > > > > > > > > > does > > > > > > > > > > > > > > > indicate file handle change (note that > > > > > > > > > > > > > > > all > > > > > > > > > > > > > > > file handles are > > > > > > > > > > > > > > > permanent, > > > > > > > > > > > > > > > meaning they are valid at the server any > > > > > > > > > > > > > > > time) around this > > > > > > > > > > > > > > > issue > > > > > > > > > > > > > > > time. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > "shell-init: error retrieving current > > > > > > > > > > > > > > > directory: getcwd: > > > > > > > > > > > > > > > cannot > > > > > > > > > > > > > > > access > > > > > > > > > > > > > > > parent directories: No such file or > > > > > > > > > > > > > > > directory" > > > > > > > > > > > > > > > sh 112544 malahal cwd > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > DIR > > > > > > > > > > > > > > > 0,67 > > > > > > > > > > > > > > > 65536 45605209 /home/malahal > > > > > > > > > > > > > > > (deleted) > > > > > > > > > > > > > > > (10.120.154.42:/nfs/malahal-export/) > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Function nfs_prime_dcache() seems to > > > > > > > > > > > > > > > invalidate the dcache > > > > > > > > > > > > > > > entry > > > > > > > > > > > > > > > if > > > > > > > > > > > > > > > nfs_same_file() returns false. > > > > > > > > > > > > > > > nfs_same_file() does seem to > > > > > > > > > > > > > > > return > > > > > > > > > > > > > > > false with the following change, if I > > > > > > > > > > > > > > > read it > > > > > > > > > > > > > > > correctly, if > > > > > > > > > > > > > > > there > > > > > > > > > > > > > > > is a > > > > > > > > > > > > > > > file handle change. Can this be the > > > > > > > > > > > > > > > source of > > > > > > > > > > > > > > > my issue? It > > > > > > > > > > > > > > > seems > > > > > > > > > > > > > > > that > > > > > > > > > > > > > > > the client should do this only if the > > > > > > > > > > > > > > > file > > > > > > > > > > > > > > > handle is NOT > > > > > > > > > > > > > > > valid > > > > > > > > > > > > > > > (e.g. > > > > > > > > > > > > > > > if it gets ESTALE), right? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > The following commit seems to assume that > > > > > > > > > > > > > > > the > > > > > > > > > > > > > > > objects are > > > > > > > > > > > > > > > different if > > > > > > > > > > > > > > > they have different file handles! > > > > > > > > > > > > > > > commit > > > > > > > > > > > > > > > 7dc72d5f7a0ec97a53e126c46e2cbd2560757955 > > > > > > > > > > > > > > > Author: Trond Myklebust < > > > > > > > > > > > > > > > trond.myklebust@xxxxxxxxxxxxxxx> > > > > > > > > > > > > > > > Date: Thu Sep 22 13:38:52 2016 -0400 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > NFS: Fix inode corruption in > > > > > > > > > > > > > > > nfs_prime_dcache() > > > > > > > > > > > > > > > > > > > > > > > > > > > > My understanding is that for NFSv3 we have > > > > > > > > > > > > > > to > > > > > > > > > > > > > > assume that > > > > > > > > > > > > > > distinct > > > > > > > > > > > > > > filehandles are distinct objects, but maybe > > > > > > > > > > > > > > I'm > > > > > > > > > > > > > > wrong about > > > > > > > > > > > > > > this. > > > > > > > > > > > > > > > > > > > > > > > > > > > > For NFSv4.x, we can follow the guidance in > > > > > > > > > > > > > > RFCs > > > > > > > > > > > > > > 5661 or 7530 > > > > > > > > > > > > > > section 10.3.4 > > > > > > > > > > > > > > to determine if the differing filehandles > > > > > > > > > > > > > > are > > > > > > > > > > > > > > the same > > > > > > > object, > > > > > > > > > > > > > > specifically > > > > > > > > > > > > > > the fileid recommended attribute needs to > > > > > > > > > > > > > > be > > > > > > > > > > > > > > implemented. Is > > > > > > > > > > > > > > Ganesha > > > > > > > > > > > > > > returning the same fileid for both > > > > > > > > > > > > > > filehandles? > > > > > > > > > > > > > > > > > > > > > > > > > > > > Ben > > > > > > > > > > > > -- > > > > > > > > > > > > Trond Myklebust > > > > > > > > > > > > CTO, Hammerspace Inc > > > > > > > > > > > > 4300 El Camino Real, Suite 105 > > > > > > > > > > > > Los Altos, CA 94022 > > > > > > > > > > > > www.hammer.space > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > Trond Myklebust > > > > > > > > > > CTO, Hammerspace Inc > > > > > > > > > > 4300 El Camino Real, Suite 105 > > > > > > > > > > Los Altos, CA 94022 > > > > > > > > > > www.hammer.space > > > > > > > > > > > > > > > > > > > > > > > > > > > > [attachment "signature.asc" deleted by Marc > > > > > > > > Eshel/Almaden/IBM] > > Trond Myklebust > > CTO, Hammerspace Inc > > 4300 El Camino Real, Suite 105 > > Los Altos, CA 94022 > > www.hammer.space > > > > -- > > Trond Myklebust > > Linux NFS client maintainer, Hammerspace > > trond.myklebust@xxxxxxxxxxxxxxx -- Trond Myklebust CTO, Hammerspace Inc 4300 El Camino Real, Suite 105 Los Altos, CA 94022 www.hammer.space