Re: "(deleted)" directories

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, 2018-11-05 at 13:10 +0530, Malahal Naineni wrote:
> > > My reading of section 10.3.4 of RFC7530 suggests that the client
> > > should
> generally compare fsid and fileid to see if two different filehandles
> refer to
> the same object or not.

Except that is wrong. As I already said in my previous email, there are
servers out there in the field that are happy to serve up snapshots
that have the exact same fsid and fileid as the original files. NetApp
will for instance happily do this by default unless you explicitly
configure it not to.

> 
> Section 10.3.4 is for files only correct? The issue here is for
> directories. Also, Trond clearly pointed that Linux breaks section
> 10.3.4 from his email stating "We treat always different filehandles
> as if they refer to different
> files. It has long been the case that snapshots from several vendors
> are encoded to look like the same file (same fileid + same  fsid) and
> differing only by filehandle. If we were to try to consolidate those
> inodes we would end up corrupting application data."
> 
> We don't respect either NFSv3 or NFSv4 RFCs in this regard!

While RFC7530 does have section 10.3.4 that describes "a reliable
method to determine whether two distinct filehandles represent distinct
objects", as long as server vendors are shipping product that violates
it, then that entire section is a moot point.

BTW: Note also how the same section reminds server vendors that "For
NFSv3 clients, the typical practice has been to assume for the purpose
of caching that distinct filehandles represent distinct file system
objects."

However, even if a client were to follow Section 10.3.4, then Section
9.1.4 states that any open/lock/delegation stateid is associated with a
_single filehandle_, and that the lock state it carries is not allowed
to be consolidated per file or fileid. See also Section 9.11, which
more explicitly describes how to treat the multiple filehandle case.

So while NFSv4 theoretically allows for the behaviour you are asking
for, it is not particularly practical to implement, and as I said, the
entire Section 10.3.4 is undermined by existing server implementations.

> Regards, Malahal.
> 
> 
> Regards, Malahal.
> On Mon, Nov 5, 2018 at 10:39 AM NeilBrown <neilb@xxxxxxxx> wrote:
> > On Mon, Nov 05 2018, Malahal Naineni wrote:
> > 
> > > > Do we know exactly why the FH changed in this particular
> > > > circumstance?
> > > 
> > > In this instance, this is due to a code bug but obviously, there
> > > are
> > > legitimate cases where this occur with Ganesha.
> > 
> > Good to know that bug has been found, and presumably fixed.
> > It is not obvious to me that there are any such legitimate cases
> > for
> > directories.
> > 
> > > > (I'm particularly thinking of volatile file handles).
> > > 
> > > NFS4 RFC has "unique filehandles" concept as well. Linux NFS
> > > client
> > > doesn't seem to use "unique filehandles" attribute as well.
> > 
> > A client doesn't need to use that attribute.
> > My reading of section 10.3.4 of RFC7530 suggests that the client
> > should
> > generally compare fsid and fileid to see if two different
> > filehandles refer to
> > the same object or not.
> > If unique_handles is known to be set for a given fsid, then
> > different
> > filehandles imply different files, without bothering to check the
> > fileid.
> > So the use of unique_handles is an optimization.
> > 
> > I haven't looked at the Linux/NFS code to see if it conforms to
> > 10.3.4.
> > 
> > NeilBrown
> > 
> > 
> > > On Mon, Nov 5, 2018 at 6:02 AM NeilBrown <neilb@xxxxxxxx> wrote:
> > > > On Sun, Nov 04 2018, Marc Eshel wrote:
> > > > 
> > > > > linux-nfs-owner@xxxxxxxxxxxxxxx wrote on 11/03/2018 10:31:29
> > > > > PM:
> > > > > 
> > > > > > From: NeilBrown <neilb@xxxxxxxx>
> > > > > > To: Marc Eshel <eshel@xxxxxxxxxx>, Trond Myklebust
> > > > > <trondmy@xxxxxxxxxxxxxxx>
> > > > > > Cc: "bcodding\@redhat.com" <bcodding@xxxxxxxxxx>, "linux-
> > > > > > nfs
> > > > > > \@vger.kernel.org" <linux-nfs@xxxxxxxxxxxxxxx>, linux-nfs-
> > > > > > owner@xxxxxxxxxxxxxxx, "malahal\@gmail.com" <
> > > > > > malahal@xxxxxxxxx>,
> > > > > > "mbenjami\@redhat.com" <mbenjami@xxxxxxxxxx>
> > > > > > Date: 11/03/2018 10:41 PM
> > > > > > Subject: Re: "(deleted)" directories
> > > > > > Sent by: linux-nfs-owner@xxxxxxxxxxxxxxx
> > > > > > 
> > > > > > On Fri, Nov 02 2018, Marc Eshel wrote:
> > > > > > 
> > > > > > > One reason to have different FHs for the same file is
> > > > > > > that a file can
> > > > > be
> > > > > > > linked from multiple directories.
> > > > > > 
> > > > > > This has some based when considering filehandles for non-
> > > > > > directories.
> > > > > > However the original problem was with filehandles for
> > > > > > directories.....
> > > > > 
> > > > > This was just an example of why FH might be different, I
> > > > > don't think we
> > > > > depend on it for the parent information anymore. Malahal
> > > > > listed some other
> > > > > reasons for having different FH for the same file. I believe
> > > > > that Ganesha
> > > > > split the FH to the key portion (the unique id of the file)
> > > > > and some other
> > > > > information that is file system dependent. If the NFS client
> > > > > can not
> > > > > handle the spec definition of FH maybe the spec should be
> > > > > updated to
> > > > > something like Ganesha does.
> > > > > Marc.
> > > > 
> > > > Do we know exactly why the FH changed in this particular
> > > > circumstance?
> > > > Is there some way to find out?
> > > > 
> > > > The NFSv3 spec has been updated - it is called "NFSv4" (now
> > > > 4.2).  It
> > > > says a lot more things about filehandles, but even there, the
> > > > spec is
> > > > only as good as the what has been implemented and tested.  I'm
> > > > pretty
> > > > sure that there are parts of the FH spec that have never been
> > > > put into
> > > > practice - so using them would not be wise (I'm particularly
> > > > thinking of
> > > > volatile file handles).
> > > > 
> > > > For better or worse, Linux requires directories to have stable
> > > > filehandles for NFSv3.  This requirement is effectively imposed
> > > > by the
> > > > dcache.  If there were some way to reliably check if two
> > > > filehandles
> > > > referred to the same directory, then we could relax that
> > > > restriction,
> > > > but I don't think there is.
> > > > 
> > > > I think the other possible reason mentioned for changing the
> > > > filehandle
> > > > is to support migration.  NFSv3 definitely doesn't support
> > > > migration.
> > > > NFSv4 explicitly tries to.
> > > > 
> > > > NeilBrown
> > > > 
> > > > 
> > > > > > > Adding the parent inode to the FH help finding the the
> > > > > > > name of the
> > > > > file by
> > > > > > > looking for the file inode in
> > > > > > > the parent directoy.
> > > > > > > 
> > > > > > 
> > > > > > ....and directories have a ".." link, obviating the need to
> > > > > > store parent
> > > > > > information in the filehandle.
> > > > > > 
> > > > > > NeilBrown
> > > > > > 
> > > > > > 
> > > > > > > Marc.
> > > > > > > 
> > > > > > > linux-nfs-owner@xxxxxxxxxxxxxxx wrote on 11/02/2018
> > > > > > > 05:15:42 PM:
> > > > > > > 
> > > > > > > > From: Trond Myklebust <trondmy@xxxxxxxxxxxxxxx>
> > > > > > > > To: "mbenjami@xxxxxxxxxx" <mbenjami@xxxxxxxxxx>
> > > > > > > > Cc: "bcodding@xxxxxxxxxx" <bcodding@xxxxxxxxxx>, "
> > > > > > > > malahal@xxxxxxxxx"
> > > > > > > > <malahal@xxxxxxxxx>, "linux-nfs@xxxxxxxxxxxxxxx"
> > > > > > > <linux-nfs@xxxxxxxxxxxxxxx>
> > > > > > > > Date: 11/02/2018 05:15 PM
> > > > > > > > Subject: Re: "(deleted)" directories
> > > > > > > > Sent by: linux-nfs-owner@xxxxxxxxxxxxxxx
> > > > > > > > 
> > > > > > > > On Fri, 2018-11-02 at 18:07 -0400, Matt Benjamin wrote:
> > > > > > > > > It sounds like a pretty good one, that goes to the
> > > > > > > > > heart of what a
> > > > > > > > > specification is
> > > > > > > > > 
> > > > > > > > 
> > > > > > > > While admittedly it is (still) Dia de los Muertos
> > > > > > > > today, I would
> > > > > think
> > > > > > > > that someone who resurrected a part of the NFSv3 spec
> > > > > > > > that has been
> > > > > > > > unused for the full 23 years of its existence might
> > > > > > > > have some
> > > > > > > > explanation for why they did so?
> > > > > > > > 
> > > > > > > > IOW: not being of a particularly religious persuasion,
> > > > > > > > I usually want
> > > > > > > > to understand why features are needed rather than
> > > > > > > > having blind faith
> > > > > in
> > > > > > > > the person who wrote the spec.
> > > > > > > > 
> > > > > > > > > Matt
> > > > > > > > > 
> > > > > > > > > On Fri, Nov 2, 2018 at 4:26 PM, Trond Myklebust <
> > > > > > > > > trondmy@xxxxxxxxxxxxxxx> wrote:
> > > > > > > > > > On Fri, 2018-11-02 at 21:24 +0530, Malahal Naineni
> > > > > > > > > > wrote:
> > > > > > > > > > > Ben, NFSv3 RFC1813.txt states: "If two file
> > > > > > > > > > > handles from the
> > > > > same
> > > > > > > > > > > server are equal, they must refer to the same
> > > > > > > > > > > file, but if
> > > > > > > > > > > they  are
> > > > > > > > > > > not equal, no conclusions can be drawn." Ganesha
> > > > > > > > > > > does return
> > > > > same
> > > > > > > > > > > fileid here (inode).
> > > > > > > > > > > 
> > > > > > > > > > > In NFSv4, they have introduced "unique_handles"
> > > > > > > > > > > attribute. I
> > > > > > > > > > > don't
> > > > > > > > > > > see
> > > > > > > > > > > Linux NFS client using this at all though.
> > > > > > > > > > 
> > > > > > > > > > Why does your server need to have multiple
> > > > > > > > > > filehandles refer to
> > > > > the
> > > > > > > > > > same file, and why do you expect clients to support
> > > > > > > > > > this?
> > > > > > > > > > 
> > > > > > > > > > Yes, the spec allows it, but that's not a
> > > > > > > > > > sufficient reason.
> > > > > > > > > > 
> > > > > > > > > > > Regards, Malahal.
> > > > > > > > > > > On Fri, Nov 2, 2018 at 4:35 PM Benjamin
> > > > > > > > > > > Coddington <
> > > > > > > > > > > bcodding@xxxxxxxxxx> wrote:
> > > > > > > > > > > > On 2 Nov 2018, at 1:26, Malahal Naineni wrote:
> > > > > > > > > > > > 
> > > > > > > > > > > > > Hi All, we are using NFS-Ganesha with Linux
> > > > > > > > > > > > > NFS clients.
> > > > > The
> > > > > > > > > > > > > client's
> > > > > > > > > > > > > shell reports the following. Based on lsof,
> > > > > > > > > > > > > the directory
> > > > > is
> > > > > > > > > > > > > marked
> > > > > > > > > > > > > deleted. "cd to ROOT and cd to the same home
> > > > > > > > > > > > > directory
> > > > > fixes
> > > > > > > > > > > > > the
> > > > > > > > > > > > > issue. The client behaves as though the
> > > > > > > > > > > > > directory is
> > > > > deleted
> > > > > > > > > > > > > and
> > > > > > > > > > > > > recreated! Our NFS-Ganesha server
> > > > > > > > > > > > > implementation uses
> > > > > > > > > > > > > multiple
> > > > > > > > > > > > > file
> > > > > > > > > > > > > handles that point to the same object. NFS
> > > > > > > > > > > > > spec says this
> > > > > > > > > > > > > should
> > > > > > > > > > > > > be
> > > > > > > > > > > > > fine, but Linux NFS seems to be broken in
> > > > > > > > > > > > > this regard.
> > > > > > > > > > > > > tcpdump
> > > > > > > > > > > > > does
> > > > > > > > > > > > > indicate file handle change (note that all
> > > > > > > > > > > > > file handles are
> > > > > > > > > > > > > permanent,
> > > > > > > > > > > > > meaning they are valid at the server any
> > > > > > > > > > > > > time) around this
> > > > > > > > > > > > > issue
> > > > > > > > > > > > > time.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > "shell-init: error retrieving current
> > > > > > > > > > > > > directory: getcwd:
> > > > > > > > > > > > > cannot
> > > > > > > > > > > > > access
> > > > > > > > > > > > > parent directories: No such file or
> > > > > > > > > > > > > directory"
> > > > > > > > > > > > > sh        112544            malahal  cwd     
> > > > > > > > > > > > >   DIR
> > > > > > > > > > > > > 0,67
> > > > > > > > > > > > >      65536   45605209 /home/malahal (deleted)
> > > > > > > > > > > > > (10.120.154.42:/nfs/malahal-export/)
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Function nfs_prime_dcache() seems to
> > > > > > > > > > > > > invalidate the dcache
> > > > > > > > > > > > > entry
> > > > > > > > > > > > > if
> > > > > > > > > > > > > nfs_same_file() returns false.
> > > > > > > > > > > > > nfs_same_file() does seem to
> > > > > > > > > > > > > return
> > > > > > > > > > > > > false with the following change, if I read it
> > > > > > > > > > > > > correctly, if
> > > > > > > > > > > > > there
> > > > > > > > > > > > > is a
> > > > > > > > > > > > > file handle change. Can this be the source of
> > > > > > > > > > > > > my issue? It
> > > > > > > > > > > > > seems
> > > > > > > > > > > > > that
> > > > > > > > > > > > > the client should do this only if the file
> > > > > > > > > > > > > handle is NOT
> > > > > > > > > > > > > valid
> > > > > > > > > > > > > (e.g.
> > > > > > > > > > > > > if it gets ESTALE), right?
> > > > > > > > > > > > > 
> > > > > > > > > > > > > The following commit seems to assume that the
> > > > > > > > > > > > > objects are
> > > > > > > > > > > > > different if
> > > > > > > > > > > > > they have different file handles!
> > > > > > > > > > > > > commit
> > > > > > > > > > > > > 7dc72d5f7a0ec97a53e126c46e2cbd2560757955
> > > > > > > > > > > > > Author: Trond Myklebust <
> > > > > > > > > > > > > trond.myklebust@xxxxxxxxxxxxxxx>
> > > > > > > > > > > > > Date:   Thu Sep 22 13:38:52 2016 -0400
> > > > > > > > > > > > > 
> > > > > > > > > > > > >     NFS: Fix inode corruption in
> > > > > > > > > > > > > nfs_prime_dcache()
> > > > > > > > > > > > 
> > > > > > > > > > > > My understanding is that for NFSv3 we have to
> > > > > > > > > > > > assume that
> > > > > > > > > > > > distinct
> > > > > > > > > > > > filehandles are distinct objects, but maybe I'm
> > > > > > > > > > > > wrong about
> > > > > > > > > > > > this.
> > > > > > > > > > > > 
> > > > > > > > > > > > For NFSv4.x, we can follow the guidance in RFCs
> > > > > > > > > > > > 5661 or 7530
> > > > > > > > > > > > section 10.3.4
> > > > > > > > > > > > to determine if the differing filehandles are
> > > > > > > > > > > > the same
> > > > > object,
> > > > > > > > > > > > specifically
> > > > > > > > > > > > the fileid recommended attribute needs to be
> > > > > > > > > > > > implemented.  Is
> > > > > > > > > > > > Ganesha
> > > > > > > > > > > > returning the same fileid for both filehandles?
> > > > > > > > > > > > 
> > > > > > > > > > > > Ben
> > > > > > > > > > --
> > > > > > > > > > Trond Myklebust
> > > > > > > > > > CTO, Hammerspace Inc
> > > > > > > > > > 4300 El Camino Real, Suite 105
> > > > > > > > > > Los Altos, CA 94022
> > > > > > > > > > www.hammer.space
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > --
> > > > > > > > Trond Myklebust
> > > > > > > > CTO, Hammerspace Inc
> > > > > > > > 4300 El Camino Real, Suite 105
> > > > > > > > Los Altos, CA 94022
> > > > > > > > www.hammer.space
> > > > > > > > 
> > > > > > > > 
> > > > > > [attachment "signature.asc" deleted by Marc
> > > > > > Eshel/Almaden/IBM]
Trond Myklebust
CTO, Hammerspace Inc
4300 El Camino Real, Suite 105
Los Altos, CA 94022
www.hammer.space

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx






[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux