> -----Original Message-----
> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Daniel Niasoff
> Sent: 16 March 2016 21:02
> To: Nick Fisk <nick@xxxxxxxxxx>; 'Van Leeuwen, Robert' <rovanleeuwen@xxxxxxxx>; 'Jason Dillaman' <dillaman@xxxxxxxxxx>
> Cc: ceph-users@xxxxxxxxxxxxxx
> Subject: Re: Local SSD cache for ceph on each compute node.
>
> Hi Nick,
>
> Your solution requires manual configuration for each VM and cannot be set up as part of an automated OpenStack deployment.

Absolutely, and it is potentially flaky as well.

> It would be really nice if it were a hypervisor-based setting as opposed to a VM-based setting.

Yes, I can't wait until we can just specify "rbd_cache_device=/dev/ssd" in the ceph.conf and get it to write to that instead. Ideally Ceph would also provide some sort of lightweight replication for the cache devices; otherwise an iSCSI SSD farm or switched SAS could be used so that the caching device is not tied to one physical host.

> Thanks
>
> Daniel
>
> -----Original Message-----
> From: Nick Fisk [mailto:nick@xxxxxxxxxx]
> Sent: 16 March 2016 08:59
> To: Daniel Niasoff <daniel@xxxxxxxxxxxxxx>; 'Van Leeuwen, Robert' <rovanleeuwen@xxxxxxxx>; 'Jason Dillaman' <dillaman@xxxxxxxxxx>
> Cc: ceph-users@xxxxxxxxxxxxxx
> Subject: RE: Local SSD cache for ceph on each compute node.
>
> > -----Original Message-----
> > From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Daniel Niasoff
> > Sent: 16 March 2016 08:26
> > To: Van Leeuwen, Robert <rovanleeuwen@xxxxxxxx>; Jason Dillaman <dillaman@xxxxxxxxxx>
> > Cc: ceph-users@xxxxxxxxxxxxxx
> > Subject: Re: Local SSD cache for ceph on each compute node.
> >
> > Hi Robert,
> >
> > > Caching writes would be bad because a hypervisor failure would result in loss of the cache, which pretty much guarantees inconsistent data on the ceph volume.
> > > Also live-migration will become problematic compared to running everything from ceph, since you will also need to migrate the local storage.
>
> I tested a solution using iSCSI for the cache devices. Each VM was using flashcache with a combination of an iSCSI LUN from an SSD and an RBD. This gets around the problem of moving things around or of the hypervisor going down. It's not local caching, but the write latency is at least 10x lower than the RBD's. Note I tested it, I didn't put it into production :-)
>
> > My understanding of how a writeback cache should work is that it should only take a few seconds for writes to be streamed onto the network, and it is focussed on resolving the speed issue of small sync writes. The writes would be bundled into larger writes that are not time sensitive.
> >
> > So there is potential for a few seconds' data loss, but compared to the current trend of using ephemeral storage to solve this issue, it's a major improvement.
>
> Yeah, the problem is that a couple of seconds of data loss means different things to different people.
>
> > > (considering the time required for setting up and maintaining the extra caching layer on each vm, unless you work for free ;-)
> >
> > Couldn't agree more there.
> >
> > I am just so surprised that the openstack community hasn't looked to resolve this issue. Ephemeral storage is a HUGE compromise unless you have built failure handling into every aspect of your application, but many people use openstack as a general-purpose devstack.
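
As an aside on the flashcache-over-iSCSI test Nick describes above, here is a minimal sketch of that kind of layering inside a guest. It is only an illustration under assumptions: the RBD is already attached to the guest as /dev/vdb, the SSD-backed iSCSI portal is 192.0.2.10, the target IQN is iqn.2016-03.example:ssd-cache and the LUN logs in as /dev/sdb. None of these names come from the actual test.

    # Log in to the SSD-backed iSCSI target from inside the guest
    iscsiadm -m discovery -t sendtargets -p 192.0.2.10
    iscsiadm -m node -T iqn.2016-03.example:ssd-cache -p 192.0.2.10 --login   # LUN appears as /dev/sdb (assumed)

    # Put a writeback flashcache device on top: SSD LUN caching the RBD-backed disk
    flashcache_create -p back cachedev /dev/sdb /dev/vdb

    # Use the combined device as usual
    mkfs.xfs /dev/mapper/cachedev
    mount /dev/mapper/cachedev /mnt/data

Because the SSD sits behind an iSCSI target rather than on the hypervisor, the cache device survives a hypervisor failure and can follow the VM around, at the cost of an extra network hop on every cache access.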
> > (Jason pointed out his blueprint, but I guess it's at least a year or two away -
> > http://tracker.ceph.com/projects/ceph/wiki/Rbd_-_ordered_crash-consistent_write-back_caching_extension)
> >
> > I see articles discussing the idea, such as this one:
> >
> > http://www.sebastien-han.fr/blog/2014/06/10/ceph-cache-pool-tiering-scalable-cache/
> >
> > but no real straightforward, validated setup instructions.
> >
> > Thanks
> >
> > Daniel
> >
> > -----Original Message-----
> > From: Van Leeuwen, Robert [mailto:rovanleeuwen@xxxxxxxx]
> > Sent: 16 March 2016 08:11
> > To: Jason Dillaman <dillaman@xxxxxxxxxx>; Daniel Niasoff <daniel@xxxxxxxxxxxxxx>
> > Cc: ceph-users@xxxxxxxxxxxxxx
> > Subject: Re: Local SSD cache for ceph on each compute node.
> >
> > > Indeed, well understood.
> > >
> > > As a shorter-term workaround, if you have control over the VMs, you could always just slice out an LVM volume from local SSD/NVMe and pass it through to the guest. Within the guest, use dm-cache (or similar) to add a cache front-end to your RBD volume.
> >
> > If you do this, you need to set up your cache as read-cache only.
> > Caching writes would be bad because a hypervisor failure would result in loss of the cache, which pretty much guarantees inconsistent data on the ceph volume.
> > Also, live-migration will become problematic compared to running everything from ceph, since you will also need to migrate the local storage.
> >
> > The question will be whether adding more RAM (== more read cache) would not be more convenient and cheaper in the end
> > (considering the time required for setting up and maintaining the extra caching layer on each vm, unless you work for free ;-). Also, reads from ceph are pretty fast compared to the biggest bottleneck: (small) sync writes.
> > So it is debatable how much performance you would win, except for some use cases with lots of reads on very large data sets which are also very latency sensitive.
> >
> > Cheers,
> > Robert van Leeuwen

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
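
For completeness, here is a minimal sketch of the shorter-term workaround Jason describes above (a local SSD slice passed through to the guest, with dm-cache via LVM in front of the RBD volume), kept in writethrough mode in line with Robert's warning about caching writes. All host, VG, LV and device names below are illustrative assumptions, not part of any setup discussed in the thread.

    # On the hypervisor (assumes a volume group "vg_ssd" on the local SSD/NVMe):
    # carve out a slice and attach it to the guest as an extra virtio disk.
    lvcreate -L 20G -n cache_vm1 vg_ssd
    virsh attach-disk vm1 /dev/vg_ssd/cache_vm1 vdc --persistent

    # Inside the guest: /dev/vdb is the RBD-backed disk, /dev/vdc is the SSD slice (assumed names).
    pvcreate /dev/vdb /dev/vdc
    vgcreate vg_guest /dev/vdb /dev/vdc
    lvcreate -n data -l 100%PVS vg_guest /dev/vdb                       # data LV lives only on the RBD
    lvcreate --type cache-pool -L 18G -n datacache vg_guest /dev/vdc    # cache pool on the SSD slice
    lvconvert --type cache --cachemode writethrough --cachepool vg_guest/datacache vg_guest/data
    mkfs.xfs /dev/vg_guest/data

Writethrough mode means a write is only acknowledged once it has reached the RBD, so losing the hypervisor (and with it the SSD slice) costs you only the warm read cache, not data consistency. Robert's other caveat still applies: the SSD slice is local to one host, so live-migration needs extra handling.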