We have a cluster running CephFS with metadata on SSDs and data split
between SSDs and HDDs (the main data pool is on HDDs; some subtrees are
on an SSD pool).
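
For context, the SSD subtrees are assigned with file layout xattrs,
roughly like this (pool name and path here are placeholders):

setfattr -n ceph.dir.layout.pool -v cephfs-data-ssd /ceph/some/subtree
getfattr -n ceph.dir.layout /ceph/some/subtree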
We're seeing quite poor deletion performance, especially for
directories. Directories that have always been empty are usually
deleted quickly, but unlinkat() on any directory that used to contain
data often takes upwards of a second. Stracing a simple `rm -r`:
unlinkat(7, "dbox-Mails", AT_REMOVEDIR) = 0 <0.002668>
unlinkat(6, "INBOX", AT_REMOVEDIR) = 0 <2.045551>
unlinkat(7, "dbox-Mails", AT_REMOVEDIR) = 0 <0.005872>
unlinkat(6, "Trash", AT_REMOVEDIR) = 0 <1.918497>
unlinkat(7, "dbox-Mails", AT_REMOVEDIR) = 0 <0.012609>
unlinkat(6, "Spam", AT_REMOVEDIR) = 0 <1.743648>
unlinkat(7, "dbox-Mails", AT_REMOVEDIR) = 0 <0.016548>
unlinkat(6, "Sent", AT_REMOVEDIR) = 0 <2.295136>
unlinkat(5, "mailboxes", AT_REMOVEDIR) = 0 <0.735630>
unlinkat(4, "mdbox", AT_REMOVEDIR) = 0 <0.686786>
(all those dbox-Mails subdirectories are empty children of the
folder-name directories)
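
The slow case is easy to sketch with a throwaway tree; this is roughly
the pattern being timed above (paths are made up):

mkdir -p /ceph/test/dir
for i in $(seq 1 1000); do touch /ceph/test/dir/file_$i; done
rm /ceph/test/dir/file_*
# the files are gone at this point; it's the rmdir of the now-empty,
# previously populated directory that stalls
strace -T -e trace=unlinkat rm -r /ceph/test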
These deletions also seem to have a huge impact on cluster performance
across hosts. This is the global MDS op latency impact of running first
one, then six parallel 'rm -r' instances from a host that is otherwise
idle:
https://mrcn.st/t/Screenshot_20190913_161500.png
(I had to stop the 6-parallel run because it was completely trashing
cluster performance for live serving machines; I wound up with load
average >900 on one of them).
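
The parallel run itself was nothing more sophisticated than something
like this (paths are illustrative):

for u in user1 user2 user3 user4 user5 user6; do
    rm -r "/ceph/mail/$u" &
done
wait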
The OSD SSDs/HDDs are not significantly busier during the deletions,
nor is MDS CPU usage noticeably higher at that time, so I'm not sure
what the bottleneck is here.
Is this expected for CephFS? I know data deletions are asynchronous,
but not being able to delete metadata/directories without an undue
impact on overall filesystem performance is somewhat problematic.
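
For what it's worth, the background purge activity can be watched on
the MDS host via the admin socket, roughly like this (daemon name is a
placeholder, and the exact counter names may vary by release):

ceph daemon mds.a perf dump | grep -E 'num_strays|pq_'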
--
Hector Martin (hector@xxxxxxxxxxxxxx)
Public Key: https://mrcn.st/pub