I have a production CephFS (13.2.6 Mimic) with >400K strays. I believe
this is caused by snapshots. The backup process for this filesystem
consists of creating a snapshot and rsyncing it over daily, and
snapshots are kept locally in the FS for 2 months for backup and
disaster recovery reasons.
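For reference, the total stray count shows up in the MDS perf counters
via the admin socket ("mds.a" below stands in for the actual daemon
name):

ceph daemon mds.a perf dump | grep num_strays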
As I understand it, any deleted files that are still referenced by a
snapshot end up being moved to the stray directories, right?
I've seen stories of problems once the stray count hits 1M (100k in
each of the ten stray subdirectories), so I'm worried about hitting
that limit as the data volume grows. AIUI dirfrags are enabled by
default now, so I expect the stray directories to be fragmented too,
but from what little documentation I can find, this does not seem to be
the case.
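The 100k-per-subdirectory figure matches the mds_bal_fragment_size_max
default, AFAICT; the running value can be queried through the admin
socket (daemon name again a placeholder):

ceph daemon mds.a config get mds_bal_fragment_size_max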
rados -p cephfs_metadata listomapkeys 600.00000000 | wc -l
43014
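Running the same count over the first fragment of all ten stray
subdirectories (inodes 0x600 through 0x609 for rank 0, AFAICT):

for ino in 600 601 602 603 604 605 606 607 608 609; do
    printf '%s.00000000: ' "$ino"
    rados -p cephfs_metadata listomapkeys $ino.00000000 | wc -l
done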
The fragment is the '00000000' in the object name, right? If so, each
stray subdir seems to be holding about 10% of the total strays in its
first fragment, with no additional fragments. As I understand it, new
fragments should start being created once a directory grows past 10000
entries (the mds_bal_split_size default).
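To double-check that no additional fragments exist, every object
belonging to the stray dirs can be listed; a split should show up as
objects with a frag suffix other than 00000000:

rados -p cephfs_metadata ls | grep -E '^60[0-9]\.'

(rados ls enumerates the whole pool, so this is slow on a large
metadata pool)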
(aside: is there any good documentation about the on-RADOS data
structures used by CephFS? I would like to get more familiar with them,
so I have a better chance of fixing problems should I run into some
data corruption in the future)
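So far the only way I've found to inspect these structures is poking at
the omap data by hand, e.g. dumping a single dentry value out of a
stray dir (dentry keys appear to be of the form <name>_head; the key
below is a placeholder to fill in from the listomapkeys output):

rados -p cephfs_metadata getomapval 600.00000000 <some_name>_head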
--
Hector Martin (hector@xxxxxxxxxxxxxx)
Public Key: https://mrcn.st/pub