Re: [PATCH] qemu-img: Add nocache command line option

Kevin Wolf <kwolf@xxxxxxxxxx> · Wed, 15 Jun 2011 14:26:37 +0200

Am 15.06.2011 14:15, schrieb Federico Simoncelli:
> ----- Original Message -----
>> From: "Kevin Wolf" <kwolf@xxxxxxxxxx>
>> To: "Federico Simoncelli" <fsimonce@xxxxxxxxxx>
>> Cc: kvm@xxxxxxxxxxxxxxx, "Avi Kivity" <avi@xxxxxxxxxx>
>> Sent: Wednesday, June 15, 2011 1:45:02 PM
>> Subject: Re: [PATCH] qemu-img: Add nocache command line option
>> Am 15.06.2011 13:17, schrieb Federico Simoncelli:
>>> ----- Original Message -----
>>>> From: "Kevin Wolf" <kwolf@xxxxxxxxxx>
>>>> To: "Avi Kivity" <avi@xxxxxxxxxx>
>>>> Cc: "Federico Simoncelli" <fsimonce@xxxxxxxxxx>,
>>>> kvm@xxxxxxxxxxxxxxx
>>>> Sent: Wednesday, June 15, 2011 12:50:21 PM
>>>> Subject: Re: [PATCH] qemu-img: Add nocache command line option
>>>>
>>>> Your patch confuses a write-through cache (cache=writethrough ==
>>>> O_SYNC)
>>>> with bypassing the cache (cache=none == O_DIRECT), which are
>>>> entirely
>>>> different.
>>>
>>> I know the difference, how am I confusing them in the patch?
>>> I am using BDRV_O_NOCACHE which is translated to O_DIRECT, which is
>>> what I wanted to use (as per commit message: nocache).
>>> Maybe I overlooked something, can you point me to the code parts you
>>> suppose are wrong?
>>
>> The first thing that I noticed is that you call it write-through in
>> the
>> help message. But you also unset BDRV_O_CACHE_WB, while cache=none is
>> a
>> write-back cache mode, in fact.
> 
> You are totally right, I overlooked the help message!
> What confuses me is that cache=none shouldn't be write-back (strictly
> speaking), but it probably does just to avoid the use of O_DIRECT
> with O_DSYNC:

And this is exactly what makes it a write-back mode.

I know that people often don't expect this, but even though we're
bypassing the host page cache, it's a write-back mode. The data might be
cached in a volatile write cache of the disk, for example.

> [blockdev.c:330]
> 
>   if (!strcmp(buf, "off") || !strcmp(buf, "none")) {
>       bdrv_flags |= BDRV_O_NOCACHE | BDRV_O_CACHE_WB;
>   } else if [...]
> 
> [block/raw-posix.c:203]
> 
>   if ((bdrv_flags & BDRV_O_NOCACHE))
>       s->open_flags |= O_DIRECT;
>   if (!(bdrv_flags & BDRV_O_CACHE_WB))
>       s->open_flags |= O_DSYNC;
> 
>>>>> IMO better to use the existing qemu option format, say -o
>>>>> cache=writeback|writethrough|none|volatile.
>>>>
>>>> No, -o is used with qemu-img create for options that change what
>>>> the
>>>> image will look like (e.g. setting flags in the image header). This
>>>> options is about the cache mode that is used for various
>>>> operations.
>>>
>>> Actually I am just working over this. I am going to add a qemu-img
>>> specific option, roughly something like this:
>>>
>>> static QEMUOptionParameter qemuimg_options[] = {
>>>     {
>>>         .name = QEMUIMG_OPT_CACHE,
>>>         .type = OPT_STRING,
>>>         .help = "Write cache method"
>>>     },
>>>     { NULL }
>>> };
>>> [...]
>>>     create_options = append_option_parameters(create_options,
>>>                                               qemuimg_options);
>>> [...]
>>>     /* Setting output cache flag */
>>>     out_flags = BDRV_O_RDWR | BDRV_O_NO_FLUSH;
>>>     ret = qemuimg_option_cache(param, &out_flags);
>>>     if (ret < 0) {
>>>         goto out;
>>>     }
>>
>> My concerns were not about the implementation (I'm sure that you can
>> do
>> that), but rather on a conceptual level. Persistent image options are
>> semantically different from options only used for the qemu-img
>> operation, so mixing the two things gives you a bad user interface.
> 
> What I understood was that we were trying to conform to the qemu-kvm
> -drive option which already looks very similar:
> 
>  cache=<mode>,format=<fmt>,...
> 
> So in the final analysis which one do you prefer?
> 
>  qemu-img -t none|writeback|writethrough ...
>  qemu-img -o cache=none|writeback|writethrough ...

I would pick the former.

Maybe we can even use a cache=<mode>,format=<fmt> style thing that
resembles -drive. Just don't make it -o but something separate.

Kevin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html