Hi Andras,

On Wed, Nov 3, 2021 at 10:18 AM Andras Pataki
<apataki@xxxxxxxxxxxxxxxxxxxxx> wrote:
>
> Hi cephers,
>
> Recently we've started using cephfs snapshots more - and seem to be
> running into a rather annoying performance issue with the MDS. The
> cluster in question is on Nautilus 14.2.20.
>
> Typically, the MDS processes a few thousand requests per second with all
> operations showing latencies in the few millisecond range (in mds perf
> dump) and the system seems quite responsive (directory listings, general
> metadata operations feel quick). Every so often, the MDS transitions
> into a period of super high latency: 0.1 to 2 seconds per operation (as
> measured by increases in the latency counters in mds perf dump). During
> these high latency periods, the request rate is about the same (low
> 1000s requests/s) - but one thread of the MDS called 'fn_anonymous' is
> 100% busy. Pointing the debugger to it and getting a stack trace at
> random times always shows a similar picture:

Thanks for the report and useful stack trace. This is probably
corrected by the new use of a "fair" mutex in the MDS:

https://tracker.ceph.com/issues/52441

The fix will be in 16.2.7.

-- 
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat, Inc.
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
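
For readers unfamiliar with the term: a "fair" mutex hands the lock to
waiters in the order they asked for it, so one busy thread cannot keep
re-acquiring the lock ahead of the others. The sketch below illustrates
the general ticket-lock idea in Python. It is not the Ceph code, and the
class name FairMutex and its fields are hypothetical names chosen only
for this illustration.

```python
import threading


class FairMutex:
    """Illustrative ticket lock: waiters get the lock in arrival (FIFO) order.

    Hypothetical sketch of the general technique, not Ceph's implementation.
    """

    def __init__(self):
        self._cond = threading.Condition()
        self._next_ticket = 0   # next ticket number to hand out
        self._now_serving = 0   # ticket number currently allowed to hold the lock

    def acquire(self):
        with self._cond:
            # Take a ticket, then wait until that ticket is being served.
            my_ticket = self._next_ticket
            self._next_ticket += 1
            while my_ticket != self._now_serving:
                self._cond.wait()

    def release(self):
        with self._cond:
            # Pass the lock to the next ticket holder and wake all waiters;
            # only the waiter whose ticket matches will proceed.
            self._now_serving += 1
            self._cond.notify_all()

    def __enter__(self):
        self.acquire()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.release()
        return False
```

With an ordinary mutex, a thread that releases and immediately
re-acquires can win the race repeatedly while others starve. Here it
would have to take a new ticket and queue behind everyone already
waiting.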