I have a 1.6 TB collection of 8 million files in CephFS, distributed up to 8-10 directories deep. (Never mind why - this design decision is out of my hands and not in scope.) I need to expose this data on multiple application servers. For the sake of argument, let's say I'm exposing files from CephFS via Apache. (This is true, but it's not the only use case.)

This is pretty slow to search, but so be it. However, if someone wants to copy the entire body of files, that's something I should be able to speed up. I see two options:

1. Place the 8M files in a disk image. Mount the disk image (read-only) to provide access to the 8M files, and allow copying the disk image itself to accelerate reads of the entire dataset.

2. Put the 8M files in an RBD image and mount that instead. I assume that if it's read-only I can map it on multiple heads -- true?

Questions:

q1. CephFS has a tunable for maximum file size, currently set to 1 TB. If I want to change this, what needs to be done or redone? Do I have to rebuild the filesystem, or can I just change the parameter, restart services, and be off?

q2. Option 2 sounds fine, except that the only access to the RBD's raw blocks is then via the block device in /dev. I expose the CephFS mount to users, but not /dev. Is there a way to map the RBD as a pseudo-file within the CephFS mount? If not, then perhaps I'm looking at a bind/loopback mount of /dev/rbd/rbd into the user-visible namespace?

Thanks for information and suggestions.

-- 
David Champion • dgc@xxxxxxxxxxxx • University of Chicago
Enrico Fermi Institute • Computation Institute • USATLAS Midwest Tier 2
OSG Connect • CI Connect

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
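
[For context, a sketch of what the q1 change might look like. This assumes a reasonably recent Ceph release where max_file_size is a per-filesystem setting adjustable at runtime; the filesystem name "cephfs" is a placeholder. Untested here -- please correct me if a rebuild is in fact required.]

```shell
# Show the current max_file_size for the filesystem (placeholder name "cephfs")
ceph fs get cephfs | grep max_file_size

# Raise the limit to 2 TB (value is in bytes); assumption: this takes
# effect for newly written files without rebuilding the filesystem
ceph fs set cephfs max_file_size 2199023255552
```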
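
[And for q2, the sort of thing I have in mind, assuming read-only mapping works on multiple heads. Pool/image names and mount points are placeholders; whether Apache will actually serve a bind-mounted device node is exactly what I'm unsure about.]

```shell
# Map the RBD image read-only on each application server
# (placeholder pool "data", image "dataset")
rbd map --read-only data/dataset

# Expose the filesystem inside the image to users, read-only
mount -o ro /dev/rbd/data/dataset /srv/dataset

# Bind-mount the raw block device node onto a regular path in the
# user-visible tree, so the whole image can be copied as one stream
touch /srv/export/dataset.img
mount --bind /dev/rbd/data/dataset /srv/export/dataset.img
```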