We sync the file system without preserving hard links. But we take
snapshots after each sync, so I guess files that are deleted while still
referenced by a snapshot could also end up in the stray directories?
[root@mds02 ~]# ceph daemon mds.mds02 perf dump | grep -i 'stray\|purge'
"finisher-PurgeQueue": {
"num_strays": 990153,
"num_strays_delayed": 32,
"num_strays_enqueuing": 0,
"strays_created": 753278,
"strays_enqueued": 650603,
"strays_reintegrated": 0,
"strays_migrated": 0,
So num_strays is indeed close to a million.
On 10/09/2019 12:42, Burkhard Linke wrote:
Hi,
do you use hard links in your workload? The 'no space left on device'
message may also refer to too many stray files. Strays are either files
that are queued for deletion (via the purge queue) or files that have
been deleted but still have hard links pointing to the same content.
Since cephfs does not use an indirection layer between inodes and data,
and the data chunks are named after the inode id, removing the original
file leaves stray entries behind, because cephfs is not able to rename
the underlying rados objects.
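As a quick illustration (the mount path and data pool name here are just
examples), you can see the inode-based object naming yourself:

stat -c %i /mnt/cephfs/somefile                  # decimal inode number of the file
printf '%x\n' 1099511627776                      # convert it to hex, e.g. 10000000000
rados -p cephfs_data stat 10000000000.00000000   # first chunk (4 MB by default) of the file data

Once the hard-linked original is removed, those objects keep their
inode-based names, which is why the dentry has to live on as a stray.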
There are 10 hidden directories for stray files, and given a maximum
size of 100,000 entries per directory, you can store only up to 1
million stray entries in total. I don't know exactly how entries are
distributed among the 10 directories, so the limit may be reached
earlier for a single stray directory. The performance counters contain
some values for strays, so they are easy to check; the daemonperf
output also shows the current value.
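If I remember correctly, the stray directories are the inodes 0x600 to
0x609 of mds rank 0, stored as objects 600.00000000 through
609.00000000 in the metadata pool, so a rough way to check the
per-directory distribution (metadata pool name assumed) would be:

for i in 0 1 2 3 4 5 6 7 8 9; do
    echo -n "stray$i: "
    # count the omap keys (= dentries) in the first dirfrag object;
    # a fragmented stray directory has additional objects per fragment
    rados -p cephfs_metadata listomapkeys 60$i.00000000 | wc -l
done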
The problem of the upper limit on directory entries was solved by
directory fragmentation, so you should check whether fragmentation is
allowed in your filesystem. You can also try to increase the upper
directory entry limit, but this might lead to other problems (e.g. too
large rados omap objects).
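Something like the following should cover both knobs; the filesystem
name and the new limit are just examples, and on Luminous and later
fragmentation is already enabled by default:

ceph fs set cephfs allow_dirfrags true                        # pre-Luminous only
ceph daemon mds.mds02 config get mds_bal_fragment_size_max    # default is 100000
ceph daemon mds.mds02 config set mds_bal_fragment_size_max 200000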
Regards,
Burkhard