Stray count increasing due to snapshots (?)

I have a production CephFS (13.2.6 Mimic) with >400K strays. I believe this is caused by snapshots. The backup process for this filesystem consists of creating a snapshot and rsyncing it over daily, and snapshots are kept locally in the FS for 2 months for backup and disaster recovery reasons.

As I understand it, any deleted files that are still referenced from a snapshot end up being moved to the stray directories, right?

I've seen reports of problems once the stray count hits 1M (100K per stray subdirectory), so I'm worried this may happen here as the data volume grows. AIUI dirfrags are enabled by default now, so I'd expect the stray directories to be fragmented too, but from what little documentation I can find, that does not seem to be the case.
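One way to check (a sketch based on my understanding, not something I've verified against the source): the stray subdirs stray0..stray9 live at inodes 0x600..0x609, and each dirfrag is stored as a metadata-pool object named "<inode-hex>.<frag-hex>", so any object with a suffix other than 00000000 would indicate an extra fragment:

```shell
# Sketch: list all dirfrag objects belonging to the ten stray dirs
# (inodes 0x600..0x609). If the stray dirs were fragmented, objects
# with a fragment suffix other than 00000000 would show up here.
# Pool name cephfs_metadata is from my cluster; adjust as needed.
rados -p cephfs_metadata ls | grep -E '^60[0-9]\.' | sort
```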

  $ rados -p cephfs_metadata listomapkeys 600.00000000 | wc -l
  43014

The fragment is the '00000000' in the object name, right? If so, each stray subdir seems to hold about 10% of the total strays in its first fragment, with no additional fragments. As I understand it, fragments should start to be created once a directory grows past 10000 entries.
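For completeness, here's a small loop (a sketch, assuming stray0..stray9 sit at inodes 0x600..0x609 and that each is unfragmented, i.e. a single 00000000 fragment) that counts the dentries in each stray subdir's first fragment and totals them:

```shell
# Sketch: count omap keys (dentries) in the first fragment of each of
# the ten stray subdirs, then sum them. Assumes the object-name
# convention "<inode-hex>.<frag-hex>" and an unfragmented layout;
# requires the rados CLI and access to the metadata pool.
total=0
for i in 0 1 2 3 4 5 6 7 8 9; do
  obj=$(printf '%x.%08x' $((0x600 + i)) 0)
  n=$(rados -p cephfs_metadata listomapkeys "$obj" | wc -l)
  echo "$obj: $n"
  total=$((total + n))
done
echo "total strays: $total"
```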

(aside: is there any good documentation about the on-RADOS data structures used by CephFS? I would like to get more familiar with everything to have a better chance of fixing problems should I run into some data corruption in the future)

--
Hector Martin (hector@xxxxxxxxxxxxxx)
Public Key: https://mrcn.st/pub
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


