Re: Possibly misleading/outdated documentation about qemu/kvm and rbd cache settings

Florian Haas <florian@xxxxxxxxxxx> · Fri, 27 Feb 2015 22:55:22 +0100

On 02/27/2015 02:46 PM, Mark Wu wrote:
> 
> 
> 2015-02-27 20:56 GMT+08:00 Alexandre DERUMIER <aderumier@xxxxxxxxx>:
> 
>     Hi,
> 
>     from qemu rbd.c
> 
>         if (flags & BDRV_O_NOCACHE) {
>             rados_conf_set(s->cluster, "rbd_cache", "false");
>         } else {
>             rados_conf_set(s->cluster, "rbd_cache", "true");
>         }
> 
>     and
>     block.c
> 
>     int bdrv_parse_cache_flags(const char *mode, int *flags)
>     {
>         *flags &= ~BDRV_O_CACHE_MASK;
> 
>         if (!strcmp(mode, "off") || !strcmp(mode, "none")) {
>             *flags |= BDRV_O_NOCACHE | BDRV_O_CACHE_WB;
>         } else if (!strcmp(mode, "directsync")) {
>             *flags |= BDRV_O_NOCACHE;
>         } else if (!strcmp(mode, "writeback")) {
>             *flags |= BDRV_O_CACHE_WB;
>         } else if (!strcmp(mode, "unsafe")) {
>             *flags |= BDRV_O_CACHE_WB;
>             *flags |= BDRV_O_NO_FLUSH;
>         } else if (!strcmp(mode, "writethrough")) {
>             /* this is the default */
>         } else {
>             return -1;
>         }
> 
>         return 0;
>     }
> 
> 
>     So rbd_cache is
> 
>     disabled for cache=directsync|none
> 
>     and enabled for writethrough|writeback|unsafe
> 
> 
>     So directsync or none should be safe even if the guest does not send flushes.
> 
> 
> 
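
A minimal standalone sketch of the mapping summarized above, combining the
two quoted snippets. The function names parse_cache_flags and
rbd_cache_for_mode are hypothetical, and the flag values are placeholders
chosen for this sketch rather than taken from the QEMU headers; only the
branch logic follows the quoted code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Placeholder bit values (an assumption of this sketch, not QEMU's headers). */
#define BDRV_O_NOCACHE    0x0020
#define BDRV_O_CACHE_WB   0x0040
#define BDRV_O_NO_FLUSH   0x0200
#define BDRV_O_CACHE_MASK (BDRV_O_NOCACHE | BDRV_O_CACHE_WB | BDRV_O_NO_FLUSH)

/* Same branch structure as the quoted bdrv_parse_cache_flags(). */
int parse_cache_flags(const char *mode, int *flags)
{
    *flags &= ~BDRV_O_CACHE_MASK;

    if (!strcmp(mode, "off") || !strcmp(mode, "none")) {
        *flags |= BDRV_O_NOCACHE | BDRV_O_CACHE_WB;
    } else if (!strcmp(mode, "directsync")) {
        *flags |= BDRV_O_NOCACHE;
    } else if (!strcmp(mode, "writeback")) {
        *flags |= BDRV_O_CACHE_WB;
    } else if (!strcmp(mode, "unsafe")) {
        *flags |= BDRV_O_CACHE_WB | BDRV_O_NO_FLUSH;
    } else if (!strcmp(mode, "writethrough")) {
        /* default: no cache flags set */
    } else {
        return -1;
    }
    return 0;
}

/*
 * Returns the string the quoted rbd.c snippet would pass for "rbd_cache"
 * ("true" or "false"), or NULL if the cache mode is not recognized.
 */
const char *rbd_cache_for_mode(const char *mode)
{
    int flags = 0;

    assert(mode != NULL);
    if (parse_cache_flags(mode, &flags) < 0) {
        return NULL;
    }
    return (flags & BDRV_O_NOCACHE) ? "false" : "true";
}
```

With this sketch, rbd_cache_for_mode("directsync") and
rbd_cache_for_mode("none") return "false", while "writethrough",
"writeback" and "unsafe" return "true", which matches the summary above.
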
>     ----- Original Message -----
>     From: "Florian Haas" <florian@xxxxxxxxxxx>
>     To: "ceph-users" <ceph-users@xxxxxxxxxxxxxx>
>     Sent: Friday, 27 February 2015 13:38:13
>     Subject: Possibly misleading/outdated documentation about
>     qemu/kvm and rbd cache settings
> 
>     Hi everyone,
> 
>     I always have a bit of trouble wrapping my head around how libvirt seems
>     to ignore ceph.conf options while qemu/kvm does not, so I thought I'd
>     ask. Maybe Josh, Wido or someone else can clarify the following.
> 
>     http://ceph.com/docs/master/rbd/qemu-rbd/ says:
> 
>     "Important: If you set rbd_cache=true, you must set cache=writeback or
>     risk data loss. Without cache=writeback, QEMU will not send flush
>     requests to librbd. If QEMU exits uncleanly in this configuration,
>     filesystems on top of rbd can be corrupted."
> 
>     Now this refers to explicitly setting rbd_cache=true on the qemu command
>     line, not having rbd_cache=true in the [client] section in ceph.conf,
>     and I'm not even sure whether qemu supports that anymore.
> 
>     Even if it does, I'm still not sure whether the statement is accurate.
> 
>     qemu has, for some time, had a cache=directsync mode which is intended
>     to be used as follows (from
>     http://lists.nongnu.org/archive/html/qemu-devel/2011-08/msg00020.html):
> 
>     "This mode is useful when guests may not be sending flushes when
>     appropriate and therefore leave data at risk in case of power failure.
>     When cache=directsync is used, write operations are only completed to
>     the guest when data is safely on disk."
> 
>     So even if there are no flush requests to librbd, users should still be
>     safe from corruption when using cache=directsync, no?
> 
>     So in summary, I *think* the following considerations apply, but I'd be
>     grateful if someone could confirm or refute them: 
> 
> 
>     cache = writethrough
>     Maps to rbd_cache=true, rbd_cache_max_dirty=0. Read cache only, safe to
> 
> Actually, qemu doesn't care about the rbd_cache_max_dirty setting. In
> writethrough mode, qemu always sends a flush after every write
> request.

So how exactly is that functionally different from rbd_cache_max_dirty=0?

>     use whether or not guest I/O stack sends flushes.
> 
>     cache = writeback
>     Maps to rbd_cache=true, rbd_cache_max_dirty > 0. Safe to use only if
>     guest I/O stack sends flushes. Maps to cache = writethrough until first 
> 
> Qemu can report to the guest whether the write cache is enabled, and the
> guest kernel can then manage the cache just as it does for a volatile
> writeback cache on a physical storage controller (please see
> https://www.kernel.org/doc/Documentation/block/writeback_cache_control.txt).
> If filesystem barriers are not disabled in the guest, data corruption
> can be avoided.

You mean block barriers? I thought those were killed upstream like 4
years ago.

Cheers,
Florian
