On Wed, Feb 26, 2014 at 6:10 AM, David Champion <dgc@xxxxxxxxxxxx> wrote:
> I have a 1.6 TB collection of 8 million files in CephFS, distributed
> up to 8-10 directories deep. (Never mind why - this design decision
> is out of my hands and not in scope.) I need to expose this data on
> multiple application servers. For the sake of argument, let's say I'm
> exposing files from CephFS via Apache. (This is true, but it's not
> the only use case.)
>
> This is pretty slow to search, but so be it. However, if someone wants
> to copy the entire body of files, that's something I should be able to
> speed up. I see two options:
>
> 1. Place the 8m files in a disk image. Mount the disk image (read-only)
> to provide access to the 8m files, and allow copying the disk image to
> accelerate reads of the entire dataset.
>
> 2. Put the 8m files in an RBD, and mount that instead. I guess if it's
> RO I can map it to multiple heads -- true?

Should be fine.

> Questions:
>
> q1. CephFS has a tunable for max file size, currently set to 1TB. If
> I want to change this, what needs to be done or redone? Do I have to
> rebuild, or can I just change the param, restart services, and be off?

What version are you running? The answer varies: on older code the
limit is set at FS creation time, but on new enough code (I don't
remember exactly when off-hand) it can be changed via the CLI
("ceph mds set max_file_size <size_in_bytes>").

> q2. Sounds fine, except then the only access to the RBD raw blocks is
> via the block dev in /dev. I expose the CephFS mount to users, but not
> /dev. Is there a way to map the RBD as a pseudo-file within the CephFS
> mount? If not, then perhaps I'm looking at a bind/loopback mount of
> /dev/rbd/rbd into the user-visible namespace?

No, there's definitely no way to map an RBD image into CephFS -- it's
a completely different data format.

-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
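
P.S. To make q1 concrete: on a release new enough to have the runtime
command, checking and raising the limit would look roughly like this
(a sketch -- the 2 TB value is just an example, and grepping the MDS
map dump for the current value is my assumption about this era's
output format):

    # show the current MDS map, which includes max_file_size
    ceph mds dump | grep max_file_size
    # raise the limit to 2 TB; the value is given in bytes
    ceph mds set max_file_size 2199023255552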
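
For option 2, mapping the image read-only on several heads would look
something like this on each application server (a sketch --
"mypool/dataset" and "/srv/dataset" are made-up names, and --read-only
assumes an rbd client new enough to support it):

    # map the RBD image read-only
    rbd map --read-only mypool/dataset
    # mount the filesystem inside it read-only as well; for ext4,
    # "noload" skips journal replay, which a read-only device can't do
    mount -o ro,noload /dev/rbd/mypool/dataset /srv/dataset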
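
And since RBD can't be mapped into CephFS, option 1 -- a disk image
stored as a single CephFS file -- is the one that gives you both views
at once (which is presumably why the 1TB max_file_size matters: the
image would be 1.6+ TB). Again a sketch, with made-up paths:

    # loop-mount the image read-only on each server
    mount -o loop,ro /mnt/cephfs/dataset.img /srv/dataset
    # a whole-dataset copy is then a single large sequential read
    cp /mnt/cephfs/dataset.img /backup/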