Re: [RFC PATCH 10/11] ceph: perform asynchronous unlink if we have sufficient caps

On Wed, Apr 10, 2019 at 2:11 PM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
>
> On Wed, Apr 10, 2019 at 4:21 PM Patrick Donnelly <pdonnell@xxxxxxxxxx> wrote:
> >
> > On Wed, Apr 10, 2019 at 7:21 AM Jeff Layton <jlayton@xxxxxxxxxxxxxxx> wrote:
> > > > > >
> > > > > > Holding caps for a request may cause deadlock. For example:
> > > > > >
> > > > > > - client holds Fx caps and sends an unlink request
> > > > > > - mds processes a request from another client; it changes filelock's
> > > > > > state to EXCL_FOO and revokes Fx caps
> > > > > > - mds receives the unlink request; it can't process it because it
> > > > > > can't acquire wrlock on filelock
> > > > > >
> > > > > > filelock state stays in EXCL_FOO because the client does not release Fx caps.
> > > > > >
> > > > >
> > > > > The client doing the unlink may have received a revoke for Fx on the
> > > > > dir at that point, but it won't have returned it yet. Shouldn't it
> > > > > still be considered to hold Fx on the dir until that happens?
> > > > >
> > > >
> > > > The client should release the Fx. But there is a problem: the mds
> > > > processes other requests first after it gets the release of Fx
> > > >
> > >
> > > As I envisioned it, the client would hold a reference to Fx while the
> > > unlink is in flight, so it would not return Fx until after the unlink
> > > has gotten an unsafe reply.
> >
> > This was my understanding as well. It seems to me that the correct
> > thing to do is to move forward with the understanding that the client
> > has a write lock on the filelock state for the directory inode (for Fx
> > cap) and a write lock on the linklock for the file inode (for the Lx
> > cap). Obtaining those locks should require cap revocation which would
> > cause the client to flush its buffered async unlinks. Importantly --
> > and what actually needs to change (?): the MDS should skip acquiring
> > those locks because the client already has the appropriate caps.
> >
> > Does that work Zheng?
> >
>
> I'm not sure it will. IIUC...
>
> I think part of what Zheng is pointing out is that when we assume that
> the client already holds certain locks, then we are effectively
> changing the order in which they can be acquired. That can leave us
> subject to ABBA style deadlocks (though with all of the complexity
> that class Locker provides).
>
> That in and of itself wouldn't be a problem if the MDS code didn't
> wait synchronously on cap revokes in some cases (which Zheng pointed
> out). Fixing that latter bit seems like it might be a big win for
> parallelism, in addition to making async calls more possible.

Well, it's not literally synchronous; it's just that the MDS holds on
to the locks it's already taken while it waits for the revoke. That's
why you can see failure cases where the MDS is still running and
resolving requests but there's a particular client which is stuck with
one operation that never moves forward.
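
To make the stuck-client case concrete, here is a toy wait-for graph for
the scenario quoted above. It is only a sketch: the node names are
illustrative labels, not Ceph identifiers, and the cycle check is a
generic one, not anything from class Locker.

```python
# Toy wait-for graph: each key waits for its value.
# All names are illustrative labels, not real Ceph identifiers.

def find_cycle(waits_for):
    """Return the nodes of a wait cycle, or None if there is none.

    Assumes each node waits on at most one other node.
    """
    for start in waits_for:
        path = [start]
        seen = {start}
        node = waits_for.get(start)
        while node is not None:
            if node == start:
                return path
            if node in seen:
                break  # reached a cycle that does not include start
            path.append(node)
            seen.add(node)
            node = waits_for.get(node)
    return None

waits_for = {
    # MDS cannot wrlock the filelock while it is in an unstable state
    "unlink_request": "filelock_wrlock",
    # the filelock only stabilizes once the client returns Fx
    "filelock_wrlock": "fx_release",
    # the client keeps Fx until the unlink has been replied to
    "fx_release": "unlink_request",
}

cycle = find_cycle(waits_for)
print(cycle)
```

Dropping any one of the three edges (for example, letting the MDS take
the wrlock while the revoke is pending) leaves no cycle, which is the
property the proposals in this thread are after.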

In the fusty cobwebs of my mind it suggests the way we'd deal with
this is by creating a NEW unstable lock state: the MDS transitions
into it and revokes the exclusive cap, but can still take all the same
local locks as it could before, and then proceeds on to the stable
state when the client's cap revoke is actually completed. Although I
thought the FileLock state would be the one that could handle this
case already since it's by far the most involved of the options....?
-Greg
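
The unstable-state idea above could be sketched roughly as follows. This
is a minimal toy model, not MDS code: the state names EXCL_REVOKING and
STABLE, the ToyLock class, and all of its methods are hypothetical, and
it assumes the MDS was able to take the wrlock while the client held
the exclusive cap.

```python
from enum import Enum, auto

class State(Enum):
    EXCL = auto()           # client holds the exclusive cap
    EXCL_REVOKING = auto()  # hypothetical: revoke sent, not yet completed
    STABLE = auto()         # hypothetical stable state after the revoke

class ToyLock:
    """Toy lock with one extra unstable state (all names hypothetical)."""

    def __init__(self):
        self.state = State.EXCL
        self.wrlocks = 0

    def start_revoke(self):
        # Begin revoking the exclusive cap; enter the unstable state.
        if self.state is State.EXCL:
            self.state = State.EXCL_REVOKING

    def wrlock(self):
        # Assumption: the same local lock that was allowed in EXCL
        # stays allowed while the revoke is still pending.
        if self.state in (State.EXCL, State.EXCL_REVOKING):
            self.wrlocks += 1
            return True
        return False

    def revoke_completed(self):
        # The client returned the cap; move on to the stable state.
        if self.state is State.EXCL_REVOKING:
            self.state = State.STABLE

lock = ToyLock()
lock.start_revoke()            # another client's request triggers the revoke
took_lock = lock.wrlock()      # the in-flight unlink can still proceed
lock.revoke_completed()        # client flushes and returns the cap
```

In this model the in-flight unlink is not blocked by the pending revoke,
so the client can finish it and then return the cap.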



[Index of Archives]     [CEPH Users]     [Ceph Large]     [Information on CEPH]     [Linux BTRFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux