Re: Possibly misleading/outdated documentation about qemu/kvm and rbd cache settings

2015-02-27 20:56 GMT+08:00 Alexandre DERUMIER <aderumier@xxxxxxxxx>:
Hi,

from qemu rbd.c

    if (flags & BDRV_O_NOCACHE) {
        rados_conf_set(s->cluster, "rbd_cache", "false");
    } else {
        rados_conf_set(s->cluster, "rbd_cache", "true");
    }

and
block.c

int bdrv_parse_cache_flags(const char *mode, int *flags)
{
    *flags &= ~BDRV_O_CACHE_MASK;

    if (!strcmp(mode, "off") || !strcmp(mode, "none")) {
        *flags |= BDRV_O_NOCACHE | BDRV_O_CACHE_WB;
    } else if (!strcmp(mode, "directsync")) {
        *flags |= BDRV_O_NOCACHE;
    } else if (!strcmp(mode, "writeback")) {
        *flags |= BDRV_O_CACHE_WB;
    } else if (!strcmp(mode, "unsafe")) {
        *flags |= BDRV_O_CACHE_WB;
        *flags |= BDRV_O_NO_FLUSH;
    } else if (!strcmp(mode, "writethrough")) {
        /* this is the default */
    } else {
        return -1;
    }

    return 0;
}


So rbd_cache is

disabled for cache=directsync|none

and enabled for writethrough|writeback|unsafe


So directsync or none should be safe even if the guest does not send flushes.
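
To make the mapping above easy to check, here is a minimal, standalone sketch that combines the two excerpts. It is not QEMU code: the flag bit values are placeholders chosen for illustration, and the helper names parse_cache_flags(), rbd_cache_value() and rbd_cache_for_mode() are hypothetical. Only the branching is taken from the excerpts quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Placeholder bit values for illustration only; the real definitions
 * live in QEMU's block layer headers and may differ. */
#define BDRV_O_NOCACHE    0x01
#define BDRV_O_CACHE_WB   0x02
#define BDRV_O_NO_FLUSH   0x04
#define BDRV_O_CACHE_MASK (BDRV_O_NOCACHE | BDRV_O_CACHE_WB | BDRV_O_NO_FLUSH)

/* Same branching as the bdrv_parse_cache_flags() excerpt above. */
static int parse_cache_flags(const char *mode, int *flags)
{
    assert(mode != NULL && flags != NULL);

    *flags &= ~BDRV_O_CACHE_MASK;

    if (!strcmp(mode, "off") || !strcmp(mode, "none")) {
        *flags |= BDRV_O_NOCACHE | BDRV_O_CACHE_WB;
    } else if (!strcmp(mode, "directsync")) {
        *flags |= BDRV_O_NOCACHE;
    } else if (!strcmp(mode, "writeback")) {
        *flags |= BDRV_O_CACHE_WB;
    } else if (!strcmp(mode, "unsafe")) {
        *flags |= BDRV_O_CACHE_WB;
        *flags |= BDRV_O_NO_FLUSH;
    } else if (!strcmp(mode, "writethrough")) {
        /* default: no cache flags set */
    } else {
        return -1;
    }
    return 0;
}

/* Mirrors the rbd.c excerpt: rbd_cache is "false" only when
 * BDRV_O_NOCACHE is set, otherwise "true". */
static const char *rbd_cache_value(int flags)
{
    return (flags & BDRV_O_NOCACHE) ? "false" : "true";
}

/* Hypothetical convenience wrapper: returns the rbd_cache value that
 * would be set for a cache= mode string, or NULL for an unknown mode. */
static const char *rbd_cache_for_mode(const char *mode)
{
    int flags = 0;

    if (parse_cache_flags(mode, &flags) != 0) {
        return NULL;
    }
    return rbd_cache_value(flags);
}
```

Calling rbd_cache_for_mode() with "none" or "directsync" yields "false", and with "writethrough", "writeback" or "unsafe" it yields "true", which matches the summary above.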



----- Original Message -----
From: "Florian Haas" <florian@xxxxxxxxxxx>
To: "ceph-users" <ceph-users@xxxxxxxxxxxxxx>
Sent: Friday, 27 February 2015 13:38:13
Subject: Possibly misleading/outdated documentation about qemu/kvm and rbd cache settings

Hi everyone,

I always have a bit of trouble wrapping my head around how libvirt seems
to ignore ceph.conf options while qemu/kvm does not, so I thought I'd
ask. Maybe Josh, Wido or someone else can clarify the following.

http://ceph.com/docs/master/rbd/qemu-rbd/ says:

"Important: If you set rbd_cache=true, you must set cache=writeback or
risk data loss. Without cache=writeback, QEMU will not send flush
requests to librbd. If QEMU exits uncleanly in this configuration,
filesystems on top of rbd can be corrupted."

Now this refers to explicitly setting rbd_cache=true on the qemu command
line, not having rbd_cache=true in the [client] section in ceph.conf,
and I'm not even sure whether qemu supports that anymore.

Even if it does, I'm still not sure whether the statement is accurate.

qemu has, for some time, had a cache=directsync mode which is intended
to be used as follows (from
http://lists.nongnu.org/archive/html/qemu-devel/2011-08/msg00020.html):

"This mode is useful when guests may not be sending flushes when
appropriate and therefore leave data at risk in case of power failure.
When cache=directsync is used, write operations are only completed to
the guest when data is safely on disk."

So even if there are no flush requests to librbd, users should still be
safe from corruption when using cache=directsync, no?

So in summary, I *think* the following considerations apply, but I'd be
grateful if someone could confirm or refute them: 

cache = writethrough
Maps to rbd_cache=true, rbd_cache_max_dirty=0. Read cache only, safe to
use whether or not guest I/O stack sends flushes.

 Actually, qemu doesn't care about the rbd_cache_max_dirty setting. In writethrough mode,
 qemu always sends a flush following every write request.

cache = writeback
Maps to rbd_cache=true, rbd_cache_max_dirty > 0. Safe to use only if
guest I/O stack sends flushes. Maps to cache = writethrough until first
flush if rbd_cache_writethrough_until_flush = true (default in master).

 Qemu can report to the guest whether the write cache is enabled, and the guest kernel can
 then manage the cache the same way it manages a volatile writeback cache on a physical
 storage controller
 (please see https://www.kernel.org/doc/Documentation/block/writeback_cache_control.txt).
 If filesystem barriers are not disabled in the guest, data corruption can be avoided.

cache = none
Maps to rbd_cache=false. No caching, safe to use regardless of guest I/O
stack flush support.

cache = unsafe
Maps to rbd_cache=true, rbd_cache_max_dirty > 0, but also *ignores* all
flush requests from the guest. Not safe to use (except in the unlikely
case that your guest never-ever writes).

cache=directsync
Maps to rbd_cache=true, rbd_cache_max_dirty=0. Bypasses the host page
cache altogether, which I think would be meaningless with the rbd
storage driver because it doesn't use the host page cache (unlike
qcow2). Read cache only, safe to use whether or not guest I/O stack
sends flushes.

Is the above an accurate summary? If so, I'll be happy to send a doc patch.

Cheers,
Florian
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com