On Sat, Sep 14, 2019 at 8:57 PM Hector Martin <hector@xxxxxxxxxxxxxx> wrote:
>
> On 13/09/2019 16.25, Hector Martin wrote:
> > Is this expected for CephFS? I know data deletions are asynchronous, but
> > not being able to delete metadata/directories without an undue impact on
> > the whole filesystem's performance is somewhat problematic.
>
> I think I'm getting a feeling for who the culprit is here. I just
> noticed that listing directories in a snapshot that were subsequently
> deleted *also* performs horribly, and kills cluster performance too.
>
> We just had a partial outage due to this: a snapshot+rsync was triggered
> while a round of deletions was happening, and as far as I can tell,
> when it caught up to newly deleted files, MDS performance tanked as it
> repeatedly had to open stray dirs under the hood. In fact, the
> inode/dentry metrics (opened/closed) skyrocketed during that period,
> from the usual ~1K ops from multiple parallel rsyncs to ~15K ops.
>
> As I mentioned in a prior message to the list, we have ~570k stray files
> due to snapshots. It makes sense that deleting a directory/file means
> moving it to a stray directory (each already holding ~57k files), and
> that accessing a deleted file via a snapshot means accessing the stray
> directory. Am I right in thinking that these operations are at least
> O(n) in the number of strays, and may in fact iterate over or otherwise
> touch every single file in the stray directories? (This would explain
> the sudden 15K ops spike in inode/dentry activity.) It seems that with
> such bloated stray dirs, anything that touches them behind the scenes
> makes the MDS completely hiccup and grind away, affecting performance
> for all other clients.
>
> I guess at this point we'll have to drastically cut down the time span
> for which we keep CephFS snapshots. Maybe I'll move the snapshot
> history keeping to the backup target; at least then it won't affect
> production data.
> But since we plan on using the other cluster for production too
> eventually, that would mean we would need to use multi-FS in order to
> isolate the workloads...

When a snapshotted directory is deleted, the MDS moves the directory
into a stray directory. You have ~57k strays in each stray directory;
each time the MDS has a cache miss for a stray, it needs to load a
stray dirfrag. This is very inefficient, because a stray dirfrag
contains lots of items, and most of them are useless for that lookup.

> --
> Hector Martin (hector@xxxxxxxxxxxxxx)
> Public Key: https://mrcn.st/pub
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
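[Editorial note, not part of the original thread] The cost pattern described above can be sketched with a toy model. This is not Ceph code; the class, names, and numbers (10 stray dirs, ~57k dentries each, as mentioned in the thread) are illustrative assumptions. The point it demonstrates: a single dentry lookup that misses the cache forces a load of the entire dirfrag, so the work done is proportional to the fragment size, not to the one entry requested.

```python
# Toy model (NOT Ceph source): why a cache miss on one stray dentry is
# expensive when stray dirfrags are huge. On a miss, the whole dirfrag
# containing the dentry must be fetched, not just the one entry.

# Illustrative numbers from the thread: ~570k strays across 10 stray
# directories => ~57k dentries per stray dirfrag.
DENTRIES_PER_FRAG = 57_000

class ToyMDSCache:
    def __init__(self):
        self.cache = {}           # dentry name -> fake inode
        self.dentries_loaded = 0  # proxy for metadata-pool I/O work

    def load_dirfrag(self, frag_id):
        """Simulate reading an entire stray dirfrag from the metadata pool."""
        for i in range(DENTRIES_PER_FRAG):
            name = f"stray{frag_id}/{i:x}"
            self.cache[name] = object()  # fake inode
            self.dentries_loaded += 1

    def lookup(self, frag_id, name):
        """One dentry lookup; a miss pulls in the whole fragment."""
        if name not in self.cache:
            self.load_dirfrag(frag_id)   # O(n) in dentries per frag
        return self.cache[name]

cache = ToyMDSCache()
cache.lookup(0, "stray0/2a")  # first miss: loads all 57k dentries
print(cache.dentries_loaded)  # 57000
cache.lookup(0, "stray0/3b")  # now a hit: no additional load
print(cache.dentries_loaded)  # still 57000
```

In this toy, one cold lookup costs 57,000 units of work for a single deleted file's dentry, which matches the shape of the observed jump from ~1K to ~15K inode/dentry ops when rsync hit newly deleted files in a snapshot.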