Re: LOCK_SYNC_MIX state makes "getattr" operations extremely slow when lots of clients issue writes or reads to the same file


 



On Mon, Feb 12, 2018 at 6:06 PM, Xuehan Xu <xxhdx1985126@xxxxxxxxx> wrote:
>> I've been following this discussion casually and am a bit confused.
>> The Client will happily send off an explicit getattr request if it
>> doesn't have enough capabilities to answer it locally.
>>
>> Is the problem here that the MDS is not answering all pending
>> CEPH_MDS_OP_GETATTR requests in one go? (Which I suppose it doesn't
>> really have a way of doing, if they haven't all been processed into
>> interior pending locks — but I think they should have gotten there if
>> caps are being recalled?)
>> Or are the clients for some reason requesting capabilities instead of
>> the single getattr message?
>> -Greg
>
> Hi, Greg.
>
> In our case, the mds does answer each CEPH_MDS_OP_GETATTR request
> separately, even when the caps are recalled. According to our mds
> log, when the caps are recalled, all of these CEPH_MDS_OP_GETATTR
> requests are added to the waiter queue, and when the filelock goes
> into a stable state, they are dispatched and reprocessed one by
> one. However, as there are writing clients that want "Fw", the very
> first CEPH_MDS_OP_GETATTR request in the waiter queue would, again,
> put the filelock into the LOCK_SYNC_MIX state, which blocks all of
> the remaining CEPH_MDS_OP_GETATTR requests from being processed.
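
(If I'm reading that right, the effect is roughly the toy loop below
(made-up names and states, not the real MDS code): every getattr left in
the queue ends up paying its own round trip of lock transitions, each of
which has to revoke caps from the other clients.)

#include <deque>
#include <iostream>

enum class FileLock { SYNC, MIX };

int main() {
  std::deque<int> waiting_getattrs = {1, 2, 3, 4};
  FileLock filelock = FileLock::MIX;   // stable again, writers hold Fw
  int transitions = 0;

  while (!waiting_getattrs.empty()) {
    if (filelock == FileLock::MIX) {
      // the first waiter needs the lock readable, so the MDS starts a
      // MIX -> SYNC transition; everyone else just goes back to waiting
      ++transitions;
      filelock = FileLock::SYNC;
      continue;
    }
    // SYNC: the head of the queue can finally be answered
    std::cout << "replied to getattr " << waiting_getattrs.front() << "\n";
    waiting_getattrs.pop_front();

    // the writers still want Fw, so the lock gets dragged back toward MIX
    // (through the transient LOCK_SYNC_MIX state) before the next waiter
    // is reprocessed
    ++transitions;
    filelock = FileLock::MIX;
  }
  std::cout << transitions << " lock transitions for 4 getattrs\n";
}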

Hmm, that makes some sense but is sad. I think to resolve this we’d
need the MDS to recognize “repetitions” of the same op type that can
be serviced in a single lock operation for essentially no extra
(locking) cost, but I’m not sure how we’d integrate that with the
capability locking going on.
I guess that when we do locking operations, if they are a single
request and don't involve giving the client caps, maybe we could stick
any requests that require the same set of locks on a shared data
structure? And then run through them all when the locks are granted and
reply to those requests? That may not involve any real fairness
tradeoffs, as long as we're careful to only do stuff that doesn't
require extra effort (beyond queueing up the message send).
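
Very roughly, and with made-up names rather than anything that exists in
the MDS today, the shared structure might look something like:

#include <functional>
#include <map>
#include <string>
#include <vector>

struct PendingRequest {
  int client_id;
  std::function<void()> reply;   // reply is fully prepared up front
};

class SharedLockWaiters {
public:
  // Park a request under the lock set it needs (e.g. "ino 0x100, filelock
  // rdlock").  Returns true if this is the first waiter, i.e. the caller
  // should actually start the lock transition; later callers piggy-back.
  bool park(const std::string &lockset_key, PendingRequest req) {
    auto &q = waiters_[lockset_key];
    q.push_back(std::move(req));
    return q.size() == 1;
  }

  // Called once the lock transition for this lock set completes: answer
  // every parked request in one go, with no further locking per request.
  void drain(const std::string &lockset_key) {
    auto it = waiters_.find(lockset_key);
    if (it == waiters_.end())
      return;
    for (auto &req : it->second)
      req.reply();
    waiters_.erase(it);
  }

private:
  std::map<std::string, std::vector<PendingRequest>> waiters_;
};

The important bit is that drain() only fires replies that were already
fully prepared when the request was parked, so finishing the whole batch
costs nothing beyond sending the messages.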

But I haven't looked at the data structures enough lately to have any
idea if something like this is really feasible.
-Greg
--


