Re: Local SSD cache for ceph on each compute node.

Hi Robert,

>Caching writes would be bad because a hypervisor failure would result in loss of the cache which pretty much guarantees inconsistent data on the ceph volume.
>Also live-migration will become problematic compared to running everything from ceph since you will also need to migrate the local-storage.

My understanding is that a writeback cache should hold writes for only a few seconds before streaming them onto the network, and that its purpose is to fix the latency of small sync writes. The small writes would be bundled into larger writes that are not time-sensitive.
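
To make the bundling idea concrete, here is a toy sketch in Python. It is not how dm-cache, librbd or any real cache is implemented; the class and method names are made up for illustration, and it assumes pending writes never partially overlap. Small writes are acknowledged locally and later flushed as merged contiguous extents.

```python
from dataclasses import dataclass, field


@dataclass
class WriteBackCache:
    """Toy model (hypothetical): buffer small writes, flush them as merged extents."""
    backend: list = field(default_factory=list)  # (offset, data) writes sent over the network
    pending: dict = field(default_factory=dict)  # offset -> data, not yet flushed

    def write(self, offset: int, data: bytes) -> None:
        # Acknowledged immediately; the data only lives in the local cache for now.
        self.pending[offset] = data

    def flush(self) -> int:
        """Merge contiguous pending writes, send them, return the number of backend writes."""
        merged = []
        for offset in sorted(self.pending):
            data = self.pending[offset]
            if merged and merged[-1][0] + len(merged[-1][1]) == offset:
                # This write starts where the previous extent ends: extend it.
                prev_offset, prev_data = merged[-1]
                merged[-1] = (prev_offset, prev_data + data)
            else:
                merged.append((offset, data))
        self.backend.extend(merged)
        self.pending.clear()
        return len(merged)


cache = WriteBackCache()
for i in range(8):
    cache.write(i * 4096, b"x" * 4096)  # eight adjacent 4 KiB sync writes
cache.write(1 << 20, b"y" * 4096)       # one unrelated write elsewhere
print(cache.flush())                    # number of writes that actually hit the network
```

Note that this toy flushes in offset order, not in the order the guest issued the writes, so it says nothing about crash consistency.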

So there is potential for a few seconds of data loss, but compared to the current trend of using ephemeral storage to solve this issue, it's a major improvement.

> (considering the time required for setting up and maintaining the extra caching layer on each vm, unless you work for free ;-)

Couldn't agree more there.

I am just surprised that the OpenStack community hasn't looked to resolve this issue. Ephemeral storage is a HUGE compromise unless you have built failure handling into every aspect of your application, but many people use OpenStack as a general-purpose devstack.

(Jason pointed out his blueprint, but I guess it's at least a year or two away: http://tracker.ceph.com/projects/ceph/wiki/Rbd_-_ordered_crash-consistent_write-back_caching_extension)

I see articles discussing the idea, such as this one:

http://www.sebastien-han.fr/blog/2014/06/10/ceph-cache-pool-tiering-scalable-cache/

but no real, straightforward, validated setup instructions.

Thanks 

Daniel


-----Original Message-----
From: Van Leeuwen, Robert [mailto:rovanleeuwen@xxxxxxxx] 
Sent: 16 March 2016 08:11
To: Jason Dillaman <dillaman@xxxxxxxxxx>; Daniel Niasoff <daniel@xxxxxxxxxxxxxx>
Cc: ceph-users@xxxxxxxxxxxxxx
Subject: Re:  Local SSD cache for ceph on each compute node.

>Indeed, well understood.
>
>As a shorter term workaround, if you have control over the VMs, you could always just slice out an LVM volume from local SSD/NVMe and pass it through to the guest.  Within the guest, use dm-cache (or similar) to add a cache front-end to your RBD volume.  

If you do this, you need to set up your cache as a read cache only.
Caching writes would be bad because a hypervisor failure would result in loss of the cache which pretty much guarantees inconsistent data on the ceph volume.
Also live-migration will become problematic compared to running everything from ceph since you will also need to migrate the local-storage.
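
A toy Python sketch of the failure case, purely illustrative: the function name, the mode labels and the three-write sequence are hypothetical, and which writes a real cache has flushed at any moment depends on its implementation. It shows how a lost write-back cache can leave the ceph volume inconsistent (not just stale) when acknowledged writes reached the backend out of order.

```python
def surviving_writes(mode, writes, flushed):
    """Return what the backing volume holds after the hypervisor dies.

    mode:    "writethrough" or "writeback" (labels used only in this sketch)
    writes:  list of (offset, data) in the order the guest issued them
    flushed: set of indexes into `writes` that the write-back cache had
             already pushed to the backing volume when the host failed
    """
    if mode not in ("writethrough", "writeback"):
        raise ValueError(f"unknown mode: {mode}")
    volume = {}
    for index, (offset, data) in enumerate(writes):
        # Write-through: every acknowledged write is already on the volume.
        # Write-back: only the flushed ones are; the rest died with the cache.
        if mode == "writethrough" or index in flushed:
            volume[offset] = data
    return volume


# A guest filesystem writes a journal entry, then the data, then a commit record.
writes = [(0, b"journal"), (4096, b"data"), (8192, b"commit")]

print(surviving_writes("writethrough", writes, set()))
# The commit record made it to the volume but the data it refers to did not.
print(surviving_writes("writeback", writes, {0, 2}))
```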

The question will be whether adding more RAM (== more read cache) would not be more convenient and cheaper in the end.
(considering the time required for setting up and maintaining the extra caching layer on each vm, unless you work for free ;-)
Also, reads from ceph are pretty fast compared to the biggest bottleneck: (small) sync writes.
So it is debatable how much performance you would gain, except for some use cases with lots of reads on very large data sets that are also very latency-sensitive.

Cheers,
Robert van Leeuwen

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


