Hello Kevin,

On Tue, Jan 10, 2017 at 4:21 PM, Kevin Olbrich <ko@xxxxxxx> wrote:
> 5x Ceph node equipped with 32GB RAM, Intel i5, Intel DC P3700 NVMe journal,

Is the "journal" used as a ZIL?

> We experienced a lot of io blocks (X requests blocked > 32 sec) when a lot
> of data is changed in cloned RBDs (disk imported via OpenStack Glance,
> cloned during instance creation by Cinder).
> If the disk was cloned some months ago and large software updates are
> applied (a lot of small files) combined with a lot of syncs, we often had a
> node hit suicide timeout.
> Most likely this is a problem with op thread count, as it is easy to block
> threads with RAIDZ2 (RAID6) if many small operations are written to disk
> (again, COW is not optimal here).
> When recovery took place (0.020% degraded) the cluster performance was very
> bad - remote service VMs (Windows) were unusable. Recovery itself was using
> 70 - 200 mb/s which was okay.

I would think having an SSD ZIL here would make a very large difference. A ZIL would likely have a much larger performance impact than an L2ARC device. [You may even partition the NVMe and have both, though I'm not sure whether that is normally recommended; a rough sketch of such a split is below.]

Thanks for your writeup!

--
Patrick Donnelly
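
For reference, a rough sketch of what the split ZIL (SLOG) + L2ARC layout could look like on the NVMe device. The pool name ("tank"), device path (/dev/nvme0n1), and partition sizes are placeholders, not details taken from Kevin's setup:

    # Carve the NVMe into a small SLOG partition and a larger L2ARC partition.
    sgdisk -n 1:0:+16G /dev/nvme0n1   # partition 1: ZIL/SLOG (a few GB is usually plenty)
    sgdisk -n 2:0:0    /dev/nvme0n1   # partition 2: remainder of the device for L2ARC

    # Attach them to the pool: "log" adds a separate intent log (SLOG),
    # "cache" adds an L2ARC device.
    zpool add tank log   /dev/nvme0n1p1
    zpool add tank cache /dev/nvme0n1p2

    # Confirm both devices show up under the pool.
    zpool status tank

Losing an L2ARC device is harmless, but losing an unmirrored SLOG at the wrong moment can cost the most recent synchronous writes, so if a second NVMe is available the log is often mirrored (zpool add tank log mirror <dev1> <dev2>).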