Quoting Stefan Kooman (stefan@xxxxxx):

> Hi List,
>
> We are planning to move a filesystem workload (currently NFS) to CephFS.
> It's around 29 TB. The unusual thing here is the number of directories
> in use to host the files. To combat a "too many files in one directory"
> scenario, a "let's make use of recursive directories" approach was taken.
> Not ideal either. This workload is supposed to be moved to (Ceph) S3
> sometime in the future, but until then, it has to go to a shared
> filesystem ...
>
> So what is unusual about this? The directory layout looks like this:
>
> /data/files/00/00/[0-8][0-9]/[0-9]/ and from this point on, 7 more
> directories are created to store 1 file.
>
> The total number of directories in a file path is 14. There are around
> 150 M files in 400 M directories.
>
> The working set won't be big. Most files will just sit around and will
> not be touched. The number of actively used files will be a few thousand.
>
> We are wondering if this kind of directory structure is suitable for
> CephFS. Might the MDS have difficulties keeping up with that many inodes
> / dentries, or doesn't it care at all?
>
> The amount of metadata overhead might be horrible, but we will test
> that out.

This awkward dataset is "live" ... and the MDS has been happily crunching
away so far, peaking at 42.5 M caps. Multiple parallel rsyncs (20+) to
fill CephFS were no issue whatsoever.

Thanks Nathan Fish and Burkhard Linke for sharing helpful MDS insight!

Gr. Stefan

--
| BIT BV  https://www.bit.nl/        Kamer van Koophandel 09090351
| GPG: 0xD14839C6                    +31 318 648 688 / info@xxxxxx
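
For anyone who wants to keep an eye on the MDS during a similar bulk import,
a rough sketch like the one below can poll the MDS perf counters for inode /
dentry / cap counts. The counter names (mds_mem.ino / dn / cap) and the
"mds.a" daemon name are placeholders for illustration and may differ per
cluster or Ceph release:

  #!/usr/bin/env python3
  # Rough sketch: poll an MDS admin socket for inode / dentry / cap counts
  # while filling CephFS. Assumptions: the ceph CLI is available on the MDS
  # host, the daemon is called "mds.a" (placeholder), and the counters live
  # under the "mds_mem" section of "perf dump".

  import json
  import subprocess
  import time

  MDS_DAEMON = "mds.a"   # placeholder daemon name, adjust for your cluster
  INTERVAL_S = 60        # polling interval in seconds


  def perf_dump(daemon):
      # Runs "ceph daemon <daemon> perf dump" and returns the parsed JSON.
      out = subprocess.check_output(["ceph", "daemon", daemon, "perf", "dump"])
      return json.loads(out)


  def main():
      while True:
          mem = perf_dump(MDS_DAEMON).get("mds_mem", {})
          print("inodes=%s dentries=%s caps=%s"
                % (mem.get("ino"), mem.get("dn"), mem.get("cap")))
          time.sleep(INTERVAL_S)


  if __name__ == "__main__":
      main()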