Re: [PATCH v10 3/8] sunrpc: create nfsd dir in rpc_pipefs

On Fri, 23 Mar 2012 11:53:37 -0400
Jeff Layton <jlayton@xxxxxxxxxx> wrote:

> On Fri, 23 Mar 2012 15:34:21 +0000
> "Myklebust, Trond" <Trond.Myklebust@xxxxxxxxxx> wrote:
> 
> > On Fri, 2012-03-23 at 11:22 -0400, J. Bruce Fields wrote:
> > > On Fri, Mar 23, 2012 at 03:20:21PM +0000, Myklebust, Trond wrote:
> > > > On Fri, 2012-03-23 at 09:31 -0400, J. Bruce Fields wrote:
> > > > > On Fri, Mar 23, 2012 at 08:12:08AM -0400, J. Bruce Fields wrote:
> > > > > > On Wed, Mar 21, 2012 at 09:52:04AM -0400, Jeff Layton wrote:
> > > > > > > Add a new top-level dir in rpc_pipefs to hold the pipe for the clientid
> > > > > > > upcall.
> > > > > > 
> > > > > > After applying this patch, my tests consistently hang.  The hang happens
> > > > > > in excltest (of the special connectathon tests), over nfs4.1 and krb5.
> > > > > > Looking at the wire traffic, I'm seeing DELAY returned from a setattr
> > > > > > for mode on a newly-created (with EXCLUSIVE4_1) file.  That open got a
> > > > > > delegation, so presumably that's what's causing the DELAY, though I'm
> > > > > > not seeing the server send a recall.  That could be a krb5 bug.
> > > > > > 
> > > > > > Whatever bug there is here, it's hard to tell why this patch in
> > > > > > particular would make it more likely.
> > > > > > 
> > > > > > So, still investigating!
> > > > > 
> > > > > Reproducible by:
> > > > > 
> > > > > 	mount -osec=krb5,minorversion=1 server:/export/ /mnt/
> > > > > 	cp cthon04/special/excltest /mnt/
> > > > > 	cd /mnt
> > > > > 	./excltest
> > > > 
> > > > Umm... When would you ever get a DELAY in the above scenario? I can see
> > > > getting an NFS4ERR_OPENMODE, but not DELAY.
> > > 
> > > There's a setattr for mode right after the open.  Is that unexpected?
> > 
> > Well yes, it is. The NFSv4.1 exclusive open should always be sending a
> > full set of attributes as part of the OPEN operation. The session replay
> > cache is now supposed to guarantee the only-once semantics that the
> > verifier used to provide.
> > 
> > > The server doesn't really have to recall the delegation in that case (it
> > > only needs to recall *other* clients' delegations) but I don't think
> > > it's wrong to.
> > 
> > Then why isn't it allowing the operation? Any sane client would normally
> > interpret NFS4ERR_DELAY to mean that the server is doing something to
> > fix whatever situation is preventing the operation from completing
> > (presumably by recalling delegations in this case). Just replying DELAY
> > and doing nothing is not helpful...
> > 
> 
> Yeah, this seems like a clear bug in the server code. I think it's
> replying DELAY in order to recall the delegation, but the delegation
> isn't getting recalled for some reason. We arguably don't actually need
> to recall it here, but I don't see any recall go out at all either...
> 
> As to why this patch seems to uncover this bug -- that's a complete
> mystery at this point...
> 

...and contrary to what Bruce has seen, I can also reproduce this when
the server is running a stock (unpatched) 3.3.0 kernel from the Fedora
rawhide repos.

-- 
Jeff Layton <jlayton@xxxxxxxxxx>