On 03/07/2014 07:32, Kyle Bader wrote:
>> I was wondering, having a cache pool in front of an RBD pool is all fine
>> and dandy, but imagine you want to pull backups of all your VMs (or one
>> of them, or multiple...). Going to the cache for all those reads isn't
>> only pointless, it'll also potentially fill up the cache and possibly
>> evict actually frequently used data. Which got me thinking... wouldn't
>> it be nifty if there was a special way of doing specific backup reads
>> where you'd bypass the cache, ensuring the dirty cache contents get
>> written to the cold pool first? Or at least doing special reads where a
>> cache miss won't actually cache the requested data?
>>
>> AFAIK the backup routine for an RBD-backed KVM usually involves creating
>> a snapshot of the RBD and putting that into a backup storage/tape, all
>> done via librbd/API.
>>
>> Maybe something like that even already exists?
>
> When used in the context of OpenStack Cinder, it does:
>
> http://ceph.com/docs/next/rbd/rbd-openstack/#configuring-cinder-backup
>
> You can have the backup pool use the default crush rules, assuming the
> default isn't your hot pool. Another option might be to put backups on
> an erasure-coded pool. I'm not sure if that has been tested, but in
> principle it should work, since the objects composing a snapshot should
> be immutable.

Hm... considering that the RBDs are accessed via the default crush rule due to the overlaying, they don't actually use the cache ruleset for anything, I wouldn't think. Also, the whole idea of backups is to have them on a separate medium. Storing RBD backups on Ceph is no better than just taking snapshots and keeping them around.

Thanks for the input though!
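
For what it's worth, the snapshot-then-export routine mentioned above can be done with the plain rbd CLI rather than librbd directly. This is only a sketch; the pool, image, and destination names are made up, and note that the export reads will still pass through whatever cache tier overlays the pool, which is exactly the concern raised:

```shell
# Point-in-time backup of an RBD image via snapshot + export.
# Pool name (rbd), image name (vm-disk1), and paths are examples only.

# 1. Take a consistent snapshot of the image.
rbd snap create rbd/vm-disk1@backup-20140307

# 2. Export the snapshot to external backup storage (file, tape
#    staging area, etc.). Snapshots are read-only, so this is safe
#    to do while the VM keeps writing to the live image.
rbd export rbd/vm-disk1@backup-20140307 /mnt/backup/vm-disk1-20140307.img

# 3. Remove the snapshot once the export has been verified.
rbd snap rm rbd/vm-disk1@backup-20140307
```

Exporting to a separate medium this way also addresses the "backups shouldn't live on the same cluster" point, unlike keeping snapshots around in Ceph itself.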