On Thu, Sep 5, 2019 at 4:31 PM Hector Martin <hector@xxxxxxxxxxxxxx> wrote:
>
> I have a production CephFS (13.2.6 Mimic) with >400K strays. I believe
> this is caused by snapshots. The backup process for this filesystem
> consists of creating a snapshot and rsyncing it over daily, and
> snapshots are kept locally in the FS for 2 months for backup and
> disaster recovery reasons.
>
> As I understand it, any files deleted which still remain referenced from
> a snapshot end up being moved to the stray directories, right?
>

Yes.

> I've seen stories of problems once the stray count hits 1M (100k per
> stray subdirectory), so I'm worried about this possibly happening in the
> future as the data volume grows. AIUI dirfrags are enabled by default
> now, so I expect the stray directories to be fragmented too, but from
> what little documentation I can find, this does not seem to be the case.
>
> rados -p cephfs_metadata listomapkeys 600.00000000 | wc -l
> 43014
>
> The fragment is the '00000000' in the object name, right? If so, each
> stray subdir seems to be holding about 10% of the total strays in its
> first fragment, with no additional fragments. As I understand it,
> fragments should start to be created when the directory grows to over
> 10000 entries.
>

Stray subdirectories never get fragmented in the current implementation.

> (aside: is there any good documentation about the on-RADOS data
> structures used by CephFS? I would like to get more familiar with
> everything to have a better chance of fixing problems should I run into
> some data corruption in the future)
>
> --
> Hector Martin (hector@xxxxxxxxxxxxxx)
> Public Key: https://mrcn.st/pub
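
If you want to tally the total stray count rather than just one subdirectory:
the MDS keeps ten stray directories (inodes 0x600 through 0x609), so the
first-fragment objects in the metadata pool are 600.00000000 through
609.00000000. A rough sketch along the lines of your command (the pool name
cephfs_metadata is taken from your example and may differ elsewhere):

    for i in 0 1 2 3 4 5 6 7 8 9; do
        # count omap keys (dentries) in each stray subdirectory's sole fragment
        echo -n "60$i.00000000: "
        rados -p cephfs_metadata listomapkeys 60$i.00000000 | wc -l
    done

The MDS also exposes an aggregate num_strays counter under mds_cache in
"ceph daemon mds.<name> perf dump", which avoids listing omap keys on a busy
metadata pool.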