Hi,

from qemu rbd.c:

    if (flags & BDRV_O_NOCACHE) {
        rados_conf_set(s->cluster, "rbd_cache", "false");
    } else {
        rados_conf_set(s->cluster, "rbd_cache", "true");
    }

and block.c:

    int bdrv_parse_cache_flags(const char *mode, int *flags)
    {
        *flags &= ~BDRV_O_CACHE_MASK;

        if (!strcmp(mode, "off") || !strcmp(mode, "none")) {
            *flags |= BDRV_O_NOCACHE | BDRV_O_CACHE_WB;
        } else if (!strcmp(mode, "directsync")) {
            *flags |= BDRV_O_NOCACHE;
        } else if (!strcmp(mode, "writeback")) {
            *flags |= BDRV_O_CACHE_WB;
        } else if (!strcmp(mode, "unsafe")) {
            *flags |= BDRV_O_CACHE_WB;
            *flags |= BDRV_O_NO_FLUSH;
        } else if (!strcmp(mode, "writethrough")) {
            /* this is the default */
        } else {
            return -1;
        }

        return 0;
    }

So rbd_cache is disabled for cache=directsync|none and enabled for cache=writethrough|writeback|unsafe, which means directsync or none should be safe even if the guest does not send flushes.

----- Original Message -----
From: "Florian Haas" <florian@xxxxxxxxxxx>
To: "ceph-users" <ceph-users@xxxxxxxxxxxxxx>
Sent: Friday, 27 February 2015 13:38:13
Subject: Possibly misleading/outdated documentation about qemu/kvm and rbd cache settings

Hi everyone,

I always have a bit of trouble wrapping my head around how libvirt seems to ignore ceph.conf options while qemu/kvm does not, so I thought I'd ask. Maybe Josh, Wido or someone else can clarify the following.

http://ceph.com/docs/master/rbd/qemu-rbd/ says:

"Important: If you set rbd_cache=true, you must set cache=writeback or risk data loss. Without cache=writeback, QEMU will not send flush requests to librbd. If QEMU exits uncleanly in this configuration, filesystems on top of rbd can be corrupted."

Now this refers to explicitly setting rbd_cache=true on the qemu command line, not to having rbd_cache=true in the [client] section in ceph.conf, and I'm not even sure whether qemu still supports that. Even if it does, I'm still not sure whether the statement is accurate.

qemu has, for some time, had a cache=directsync mode which is intended to be used as follows (from http://lists.nongnu.org/archive/html/qemu-devel/2011-08/msg00020.html):

"This mode is useful when guests may not be sending flushes when appropriate and therefore leave data at risk in case of power failure. When cache=directsync is used, write operations are only completed to the guest when data is safely on disk."

So even if there are no flush requests to librbd, users should still be safe from corruption when using cache=directsync, no?

So in summary, I *think* the following considerations apply, but I'd be grateful if someone could confirm or refute them:

cache=writethrough
Maps to rbd_cache=true, rbd_cache_max_dirty=0. Read cache only, safe to use whether or not the guest I/O stack sends flushes.

cache=writeback
Maps to rbd_cache=true, rbd_cache_max_dirty > 0. Safe to use only if the guest I/O stack sends flushes. Behaves like cache=writethrough until the first flush if rbd_cache_writethrough_until_flush = true (the default in master).

cache=none
Maps to rbd_cache=false. No caching, safe to use regardless of guest I/O stack flush support.

cache=unsafe
Maps to rbd_cache=true, rbd_cache_max_dirty > 0, but also *ignores* all flush requests from the guest. Not safe to use (except in the unlikely case that your guest never-ever writes).

cache=directsync
Maps to rbd_cache=true, rbd_cache_max_dirty=0. Bypasses the host page cache altogether, which I think is meaningless with the rbd storage driver because it doesn't use the host page cache (unlike qcow2). Read cache only, safe to use whether or not the guest I/O stack sends flushes.
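For anyone who wants to check that mapping directly, here is a minimal standalone sketch (not actual qemu source; the flag values are arbitrary placeholders, only the combinations matter) that reproduces the bdrv_parse_cache_flags() logic and the rbd_cache check quoted at the top of this thread, and prints the effective rbd_cache setting for each cache= mode:

    /*
     * Standalone sketch, not qemu code: mimics bdrv_parse_cache_flags()
     * and the BDRV_O_NOCACHE check in qemu's rbd driver to show which
     * cache= modes end up with rbd_cache enabled. Flag values are
     * placeholders chosen for this example only.
     */
    #include <stdio.h>
    #include <string.h>

    #define BDRV_O_NOCACHE    0x1
    #define BDRV_O_CACHE_WB   0x2
    #define BDRV_O_NO_FLUSH   0x4
    #define BDRV_O_CACHE_MASK (BDRV_O_NOCACHE | BDRV_O_CACHE_WB | BDRV_O_NO_FLUSH)

    static int parse_cache_flags(const char *mode, int *flags)
    {
        *flags &= ~BDRV_O_CACHE_MASK;

        if (!strcmp(mode, "off") || !strcmp(mode, "none")) {
            *flags |= BDRV_O_NOCACHE | BDRV_O_CACHE_WB;
        } else if (!strcmp(mode, "directsync")) {
            *flags |= BDRV_O_NOCACHE;
        } else if (!strcmp(mode, "writeback")) {
            *flags |= BDRV_O_CACHE_WB;
        } else if (!strcmp(mode, "unsafe")) {
            *flags |= BDRV_O_CACHE_WB | BDRV_O_NO_FLUSH;
        } else if (!strcmp(mode, "writethrough")) {
            /* writethrough is the default: no cache flags set */
        } else {
            return -1;
        }
        return 0;
    }

    int main(void)
    {
        const char *modes[] = { "writethrough", "writeback", "none",
                                "directsync", "unsafe" };

        for (size_t i = 0; i < sizeof(modes) / sizeof(modes[0]); i++) {
            int flags = 0;
            parse_cache_flags(modes[i], &flags);
            /* Same test qemu's rbd driver applies before rados_conf_set() */
            printf("cache=%-12s -> rbd_cache=%s\n", modes[i],
                   (flags & BDRV_O_NOCACHE) ? "false" : "true");
        }
        return 0;
    }

Compiled and run, it prints rbd_cache=false only for cache=none and cache=directsync, and rbd_cache=true for the other three modes.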
Is the above an accurate summary? If so, I'll be happy to send a doc patch.

Cheers,
Florian

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com