Re: CephFS deletion performance

On 13/09/2019 16.25, Hector Martin wrote:
> Is this expected for CephFS? I know data deletions are asynchronous, but 
> not being able to delete metadata/directories without an undue impact on 
> the whole filesystem performance is somewhat problematic.

I think I'm getting a feeling for what the culprit is here. I just
noticed that listing directories in a snapshot whose live counterparts
were subsequently deleted *also* performs horribly, and kills cluster
performance too.

We just had a partial outage due to this; a snapshot+rsync run was
triggered while a round of deletions was happening, and as far as I can
tell, when the rsync caught up to newly deleted files, MDS performance
tanked as the MDS repeatedly had to open stray dirs under the hood. In
fact, the inode/dentry metrics (opened/closed) skyrocketed during that
period, from the normal ~1Kops from multiple parallel rsyncs to ~15Kops.

As I mentioned in a prior message to the list, we have ~570k stray files
due to snapshots. It makes sense that deleting a directory/file means
moving it to a stray directory (each holding ~57k files already), and
accessing a deleted file via a snapshot means accessing the stray
directory. Am I right in thinking that these operations are at least
O(n) in the number of strays, and in fact may iterate over or otherwise
touch every single file in the stray directories? (That would explain
the sudden 15Kops spike in inode/dentry activity.) It seems that with
such bloated stray dirs, anything that touches them behind the scenes
makes the MDS hiccup and grind away, hurting performance for all other
clients.
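
For reference, here's the rough sketch I'm using to watch this
(untested beyond my setup; it assumes you can reach the MDS admin
socket on its host, that "mds.storage" matches your daemon name, and
that the counter names from my Nautilus-era perf dump apply to your
release):

    # Poll the MDS admin socket for stray / inode / dentry counters.
    import json
    import subprocess
    import time

    MDS_NAME = "mds.storage"  # hypothetical daemon name, adjust to yours

    def perf_dump():
        out = subprocess.check_output(
            ["ceph", "daemon", MDS_NAME, "perf", "dump"])
        return json.loads(out)

    while True:
        perf = perf_dump()
        cache = perf.get("mds_cache", {})
        mem = perf.get("mds_mem", {})
        print("strays=%s ino_opened=%s dn_opened=%s" % (
            cache.get("num_strays"),  # entries sitting in stray dirs
            mem.get("ino+"),          # cumulative inodes opened
            mem.get("dn+"),           # cumulative dentries opened
        ))
        time.sleep(10)

Watching num_strays alongside the ino+/dn+ deltas is what made the
15Kops spike obvious here.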

I guess at this point we'll have to drastically cut down the time span
for which we keep CephFS snapshots. Maybe I'll move snapshot retention
to the backup target; at least then it won't affect production data.
But since we eventually plan on using the other cluster for production
too, that would mean we need multi-FS in order to isolate the
workloads...
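
If we go that route, pruning on the backup side could be as simple as
rotating snapshots through the .snap directory (CephFS snapshots are
created/removed with mkdir/rmdir in there). A minimal sketch, assuming
a hypothetical /backup/cephfs mount and a made-up 14-day retention:

    # Rotate daily CephFS snapshots on the backup mount.
    import os
    import time

    SNAP_DIR = "/backup/cephfs/.snap"  # hypothetical backup mount
    KEEP = 14                          # days of history to retain

    # Take today's snapshot; the name is just a date stamp.
    today = time.strftime("%Y-%m-%d")
    path = os.path.join(SNAP_DIR, today)
    if not os.path.exists(path):
        os.mkdir(path)

    # Drop everything older than the retention window
    # (date-stamped names sort chronologically).
    for name in sorted(os.listdir(SNAP_DIR))[:-KEEP]:
        os.rmdir(os.path.join(SNAP_DIR, name))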

-- 
Hector Martin (hector@xxxxxxxxxxxxxx)
Public Key: https://mrcn.st/pub
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


