On 2/15/16 9:35 PM, Dave Chinner wrote:
> On Mon, Feb 15, 2016 at 04:18:28PM +0100, David Casier wrote:
>> Hi Dave,
>> 1TB is very wide for SSD.
>
> It fills from the bottom, so you don't need 1TB to make it work
> in a similar manner to the ext4 hack being described.

I'm not sure it will work for smaller filesystems, though - we
essentially ignore the inode32 mount option for sufficiently small
filesystems. i.e. if inode numbers > 32 bits can't exist, we don't
change the allocator, at least not until the filesystem (possibly)
gets grown later.

So for inode32 to impact behavior, it needs to be on a filesystem of
sufficient size (at least 1 or 2T, depending on block size, inode
size, etc). Otherwise it will have no effect today.

Dave, I wonder if we need another mount option to essentially mean
"invoke the inode32 allocator regardless of filesystem size?"

-Eric

>> Example with only 10GiB :
>> https://www.aevoo.fr/2016/02/14/ceph-ext4-optimisation-for-filestore/
>
> It's a nice toy, but it's not something that is going to scale
> reliably for production. That caveat at the end:
>
> "With this model, filestore rearrange the tree very
> frequently : + 40 I/O every 32 objects link/unlink."
>
> Indicates how bad the IO patterns will be when modifying the
> directory structure, and says to me that it's not a useful
> optimisation at all when you might be creating several thousand
> files/s on a filesystem. That will end up IO bound, SSD or not.
>
> Cheers,
>
> Dave.
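
As a rough illustration of where that "1 or 2T" figure comes from: XFS
inode numbers encode the inode's location on disk, so the largest
possible inode number grows roughly as filesystem size divided by inode
size, and inode numbers can only exceed 32 bits once the filesystem is
bigger than 2^32 * inode_size. The Python sketch below is a
back-of-the-envelope estimate only (inode32_threshold_bytes is a
hypothetical helper, not part of xfsprogs), and it ignores AG geometry
and inode alignment, which can shift the exact crossover point.

    # Back-of-the-envelope: smallest filesystem at which XFS inode
    # numbers can need more than 32 bits, assuming max inode number
    # ~= fs_size / inode_size (AG geometry and alignment ignored).
    def inode32_threshold_bytes(inode_size):
        return (1 << 32) * inode_size

    for isize in (256, 512):
        tib = inode32_threshold_bytes(isize) / float(1 << 40)
        print("inode size %3d bytes -> ~%d TiB" % (isize, int(tib)))

    # inode size 256 bytes -> ~1 TiB
    # inode size 512 bytes -> ~2 TiB

With 256-byte inodes that works out to about 1 TiB and with 512-byte
inodes about 2 TiB, which is why inode32 is effectively a no-op on the
10GiB filesystem in the blog post above.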