Quoting Stefan Kooman (stefan@xxxxxx):

> Hi List,
>
> We are planning to move a filesystem workload (currently NFS) to CephFS.
> It's around 29 TB. The unusual thing here is the number of directories
> in use to host the files. To combat a "too many files in one directory"
> scenario, a "let's make use of recursive directories" approach was taken.
> Not ideal either. This workload is supposed to be moved to (Ceph) S3
> sometime in the future, but until then, it has to go to a shared
> filesystem ...
>
> So what is unusual about this? The directory layout looks like this:
>
> /data/files/00/00/[0-8][0-9]/[0-9]/ and from this point on, 7 more
> directories are created to store 1 file.
>
> The total number of directories in a file path is 14. There are around
> 150 M files in 400 M directories.
>
> The working set won't be big. Most files will just sit around and will
> not be touched. The number of actively used files will be a few thousand.
>
> We are wondering if this kind of directory structure is suitable for
> CephFS. Might the MDS have difficulties keeping up with that many inodes
> / dentries, or doesn't it care at all?
>
> The amount of metadata overhead might be horrible, but we will test
> that out.

This awkward dataset is "live" ... and the MDS has been happily crunching
away so far, peaking at 42.5 M caps. Multiple parallel rsyncs (20+) to
fill CephFS were no issue whatsoever.

Thanks Nathan Fish and Burkhard Linke for sharing helpful MDS insight!

Gr. Stefan

--
| BIT BV  https://www.bit.nl/        Kamer van Koophandel 09090351
| GPG: 0xD14839C6                    +31 318 648 688 / info@xxxxxx
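
For anyone who wants to keep an eye on the MDS during a similar bulk import,
a rough sketch like the one below can poll the MDS perf counters for inode /
dentry / cap counts. The counter names (mds_mem.ino / dn / cap) and the
"mds.a" daemon name are placeholders for illustration and may differ per
cluster or Ceph release:

  #!/usr/bin/env python3
  # Rough sketch: poll an MDS admin socket for inode / dentry / cap counts
  # while filling CephFS. Assumptions: the ceph CLI is available on the MDS
  # host, the daemon is called "mds.a" (placeholder), and the counters live
  # under the "mds_mem" section of "perf dump".

  import json
  import subprocess
  import time

  MDS_DAEMON = "mds.a"   # placeholder daemon name, adjust for your cluster
  INTERVAL_S = 60        # polling interval in seconds


  def perf_dump(daemon):
      # Runs "ceph daemon <daemon> perf dump" and returns the parsed JSON.
      out = subprocess.check_output(["ceph", "daemon", daemon, "perf", "dump"])
      return json.loads(out)


  def main():
      while True:
          mem = perf_dump(MDS_DAEMON).get("mds_mem", {})
          print("inodes=%s dentries=%s caps=%s"
                % (mem.get("ino"), mem.get("dn"), mem.get("cap")))
          time.sleep(INTERVAL_S)


  if __name__ == "__main__":
      main()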