Re: mimic: MDS standby-replay causing blocked ops (MDS bug?)

Quoting Frank Schilder (frans@xxxxxx):
> Dear Stefan,
> 
> thanks for the fast reply. We encountered the problem again, this time in a much simpler situation; please see below. However, let me start with your questions first:
> 
> What bug? -- In a single-active MDS set-up, should there ever occur an operation with "op_name": "fragmentdir"?

Yes, see http://docs.ceph.com/docs/mimic/cephfs/dirfrags/. Directories
are also fragmented with a single active MDS. If you had multiple
active MDS daemons, the load could additionally be shared among them.

Some parameters might need to be tuned for your environment. Zheng Yan
is the expert in this matter, so his analysis of the MDS cache dump may
reveal the culprit.

> Upgrading: The problem described here is the only issue we observe.
> Unless the problem is fixed upstream, upgrading won't help us and
> would be a bit of a waste of time. If someone can confirm that this
> problem is fixed in a newer version, we will do it. Otherwise, we
> might prefer to wait until it is.

Keeping your systems up to date generally improves stability, and it
might keep you from hitting issues when your workload changes in the
future. Testing new releases on a test system first is recommended,
though.

> 
> News on the problem. We encountered it again when one of our users executed a command in parallel with pdsh on all our ~500 client nodes. This command accesses the same file from all these nodes pretty much simultaneously. We did this quite often in the past, but this time, the command got stuck and we started observing the MDS health problem again. Symptoms:

Does this command incur writes, reads, or a combination of both on
files in this directory? I wonder whether you could prevent this from
happening by tuning the "Activity thresholds" described in the dirfrags
documentation, especially since you say it depends on the load (number
of clients).
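
To get a feel for when an activity threshold could trigger, here is a
rough back-of-the-envelope sketch in Python. It is not the MDS code. It
assumes a simplified model: a per-directory counter that gains 1 per
operation and decays exponentially with half-life mds_decay_halflife,
and a split once the counter passes mds_bal_split_rd. The values 5 s
and 25000 are what I believe the mimic defaults to be; please check
them on your cluster. The function names are made up for illustration.

```python
import math


def steady_state_temperature(ops_per_second, halflife_s):
    """Long-run counter value under the simplified model.

    Assumption: the counter gains 1 per operation and halves every
    halflife_s seconds, so it settles at rate * halflife / ln(2).
    """
    return ops_per_second * halflife_s / math.log(2)


def ops_rate_to_reach(threshold, halflife_s):
    """Sustained ops/s at which the steady-state value equals threshold."""
    return threshold * math.log(2) / halflife_s


# Assumed defaults, not read from a live cluster.
ASSUMED_SPLIT_RD = 25000      # mds_bal_split_rd
ASSUMED_HALFLIFE_S = 5.0      # mds_decay_halflife
CLIENTS = 500                 # roughly the number of client nodes

total_rate = ops_rate_to_reach(ASSUMED_SPLIT_RD, ASSUMED_HALFLIFE_S)
per_client_rate = total_rate / CLIENTS

print(f"sustained reads/s needed in total: {total_rate:.1f}")
print(f"sustained reads/s per client:      {per_client_rate:.2f}")
```

Under these assumptions, a sustained total of roughly 3466 reads per
second on one directory would reach the read threshold, which is about
7 reads per second from each of 500 clients. Whether this matches what
your MDS actually does is something the cache dump should tell us.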

Gr. Stefan

-- 
| BIT BV  http://www.bit.nl/        Kamer van Koophandel 09090351
| GPG: 0xD14839C6                   +31 318 648 688 / info@xxxxxx
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


