On Wed, Oct 10, 2018 at 8:48 PM Kjetil Joergensen <kjetil@xxxxxxxxxxxx> wrote:
>
> Hi,
>
> We tested bcache, dm-cache/lvmcache, and one more whose name eludes me, with PCIe NVMe on top of large spinning rust drives behind a SAS3 expander - and decided this was not for us.
>
> This was probably jewel with filestore, and our primary reason for going down this path was that leveldb compaction was killing us, and putting omap/leveldb and the like on separate locations was only "so-so" supported (IIRC: some of it was explicitly supported, some of it you could do with a bit of symlink or mount trickery).
>
> The caching worked - although, when we started doing power-failure survivability testing (power cycle the entire rig, wait for recovery, repeat), we ended up with seriously corrupted XFS filesystems on top of the cached block device within a handful of power cycles. We did not test fully disabling the spinning rust's on-device cache (which was the leading hypothesis for why this failed, potentially combined with the ordering of FLUSH+FUA ending up slightly funky given the rather asymmetric commit latency). Just to rule out anything else, we ran the same power-fail test regimen for days without the nvme-over-spinning-rust caching, without triggering the same filesystem corruption.
>
> So yeah - I'd recommend looking at bluestore instead and sticking rocksdb, the journal, and anything else performance-critical on faster storage.
>
> If you do decide to go down the dm-cache/lvmcache/(other cache) road, I'd recommend thoroughly testing failure scenarios such as power loss, so you don't find out accidentally when you have a multi-failure-domain outage. :)

Yeah, definitely do a lot of pulling disks and power cycle testing.
dm-cache had a data corruption on power loss bug in 4.9+:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5b1fe7bec8a8d0cc547a22e7ddc2bd59acd67de4

Thanks,

        Ilya
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
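
[Editorial aside: the on-device write cache mentioned as the leading corruption hypothesis can be disabled for a test run roughly as sketched below. Device names are placeholders and behaviour varies by drive, firmware, and controller, so verify the setting actually sticks on your own hardware before relying on it.]

    # SATA drives: query, then disable, the volatile write cache
    hdparm -W /dev/sdX
    hdparm -W 0 /dev/sdX

    # SAS drives (e.g. behind a SAS3 expander): check and clear the WCE bit,
    # with --save to make the change persist across power cycles
    sdparm --get WCE /dev/sdX
    sdparm --clear WCE --save /dev/sdX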
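
[Editorial aside: as a minimal sketch of the bluestore recommendation above, ceph-volume (Luminous and later) lets you keep the data on the spinning disk while placing rocksdb metadata (block.db, which also carries the WAL unless it is split out separately) on the NVMe. The device names below are hypothetical examples:]

    # OSD data on the HDD, rocksdb DB (and WAL) on an NVMe partition or LV
    ceph-volume lvm create --bluestore --data /dev/sdX --block.db /dev/nvme0n1pY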