Re: [PATCH v6 0/2] Add support for qcow2 cache

John Ferlan <jferlan@xxxxxxxxxx> · Wed, 4 Oct 2017 17:51:19 -0400

On 09/27/2017 11:23 PM, Liu Qing wrote:
> On Wed, Sep 27, 2017 at 02:07:02PM -0400, John Ferlan wrote:
>>
>>
>> On 09/27/2017 08:42 AM, Pavel Hrdina wrote:
>>> On Wed, Sep 27, 2017 at 02:24:12PM +0200, Pavel Hrdina wrote:
>>>> On Tue, Sep 26, 2017 at 05:06:37PM -0400, John Ferlan wrote:
>>>>>
>>>>>
>>>>> On 09/18/2017 11:45 PM, Liu Qing wrote:
>>>>>> Qcow2 small IO random write performance will drop dramatically if the l2
>>>>>> cache table could not cover the whole disk. This will be a lot of l2
>>>>>> cache table RW operations if cache miss happens frequently.
>>>>>>  
>>>>>> This patch exports the qcow2 driver parameter
>>>>>> l2-cache-size/refcount-cache-size, first added in Qemu 2.2, and
>>>>>> cache-clean-interval, first added in Qemu 2.5, in libvirt.
>>>>>>
>>>>>> change since v4: removed unnecessary cache error check
>>>>>>
>>>>>> Liu Qing (2):
>>>>>>   conf, docs: Add qcow2 cache configuration support
>>>>>
>>>>> According to what you added for the v5 - there's an upstream bz that
>>>>> could be associated with this work.  Is this something that should be
>>>>> added to the commit message so that we can "track" where the idea comes
>>>>> from?
>>>>>
>>>>> Beyond that - I agree setting policy is not the business we want to be
>>>>> in. I just figured I'd throw it out there. I didn't read all of the qemu
>>>>> links from the bz about thoughts on this - too many links too much to
>>>>> follow not enough time/desire to think about it.
>>>>>
>>>>> I'm not against pushing these patches, but would like to at least make
>>>>> sure Peter isn't 100% against them. I agree the attributes are really
>>>>> painful to understand, but it's not the first time we've added some
>>>>> attribute that comes with a beware of using this unless you know what
>>>>> you're doing.
>>>>
>>>> I'm still not convinced that this is something we would like to expose
>>>> to users.  It's way too low-level for libvirt XML.  I like the idea to
>>>> allow 'max' as possible value in QEMU and that could be reused by
>>>> libvirt.
>>>>
>>>> One may argue that having two options as "max storage" or
>>>> "max performace" isn't good enough and they would like to be able to
>>>> tune it for they exact needs.  Most of the users don't care about the
>>>> specifics and they just need a performance or not.
>>>>
>>>> I attempted to introduce "polling" feature [1] for iothreads but that
>>>> was a bad idea and too low-level for libvirt users and configuring exact
>>>> values for qcow2 cache seems to be the same case.
>>>>
>>>> Adding an element/attribute into XML where you would specify whether you
>>>> care about performance or not isn't a policy.  Libvirt will simply use
>>>> some defaults or configure the maximum performance.  For fine-grained
>>>> tuning you can use <qemu:arg value='...'/> like it was suggested for
>>>> "polling" feature.
>>>>
>>>> Pavel
>>>
>>> Ehm, I forgot to include the link to the "polling" discussion:
>>>
>>> [1] <https://www.redhat.com/archives/libvir-list/2017-February/msg01084.html>
>>>
>>
>> I scanned the .1 discussion... it seems as though there were adjustments
>> made to the default iothreads to make them useful or at least enabled by
>> default a better set of values which could be disabled via some
>> <qemu:arg value='-newarg'/> setting.
>>
>> From the discussion thus far for qcow2 cache the difference here is that
>> the default values QEMU uses have been inadequate for real world type
>> usages and it doesn't appear convergence is possible because the
>> possible adjustments has vary degrees of issues. I think it's nicely
>> summed up here :
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=1377735#c2 and
>> https://bugzilla.redhat.com/show_bug.cgi?id=1377735#c3
>>
>> Following the link in the bz from comment 5 to the qemu patches shows
>> only a couple of attempts at trying something before the idea was
>> seemingly dropped. I looked forward and only recently was the idea
>> resurrected, but perhaps in perhaps a bit different format.
>>
>> https://lists.gnu.org/archive/html/qemu-block/2017-09/msg00635.html
> Looked into the thread, not sure whether the author will change the originial
> qcow2 cache configuration. Personally I don't think so, he may add a new
> parameter which could be used to configure how much memory will be consumed
> for each "slice" as you suggested.
> 
> If the cache is separated to size smaller than cluster, this will lead
> to more frequent IO access. Athough memory could be used more
> efficiently. Mmm... this should not be in this thread.
>>
>> In any case, I've sent an email to the author of those patches to get
>> insights regarding adding exposure from libvirt is a good or not and to
>> determine if the new proposal somehow overrides any settins that would
>> be made. It'd be awful if a value was provided, but how that value is
>> used gets changed...
>>
>> John

I've had my head down in other rabbit holes, but this one has surfaced
again and I would like to think we could come to a consensus and/or
conclusion early in this 3.9.0 release so that we can make a decision to
merge these patches or officially NACK them.

I contacted the QEMU author (email attached for those interested in more
reading materials and he's also CC'd in this response) to allow him to
read the context of the concerns thus far and of course provide any
other thoughts or answer questions directly if necessary.

I certainly understand the position to not include something that's
difficult to describe/configure; however, if by all accounts certain
configurations can receive greater performance/throughput given the
ability to configure the parameters, then I'd hedge on the side of I
think we should merge the patches (with some adjustments that I'm
willing to make to use 3.9.0 instead of 3.8.0 in the documentation of
where the patches got merged).

While looking at something else recently - it dawned on me that these
changes do not include any tests for hot plug... and then of course is
that something that would (be expected to) work and whether that option
was tested by the author.

If we do merge them it doesn't seem from Berto's response that future
QEMU adjustments in this area will be affected if someone does configure
and provide the parameters.

John

--- Begin Message ---

Subject: Re: Question regarding l2_cache_size as it could relate to libvirt...
From: Alberto Garcia <berto@xxxxxxxxxx>
Date: Thu, 28 Sep 2017 15:13:42 +0200
In-reply-to: <9f4a301a-f69f-521f-b6ea-c233b6a7ee07@redhat.com>
References: <9f4a301a-f69f-521f-b6ea-c233b6a7ee07@redhat.com>
User-agent: Mutt/1.5.23 (2014-03-12)

Hi John,

thanks for contacting me on this! I don't normally read libvirt-devel
so I was not aware of this, although I do remember bug #1377735 (as
you have probably seen I wrote some comments there). There was some
discussion in qemu-block but in the end nothing was agreed and no code
was committed.

I wrote a blog post explaining how to configure l2_cache_size,
refcount_cache_size, and cache_clean_interval (it's linked from that
bug report). That's the current status of the qcow2 cache in QEMU.

I haven't thought a lot about this topic, but here are some ideas:

   - Being able to change the cache size is important (especially the
     L2 cache) because it has a very big impact on performance.

   - Figuring out how much cache you need is not something immediately
     obvious (its size depends on the size of the image AND its
     cluster size). Ideally the end user should not need to know the
     details of how to calculate this, but I don't know how feasible
     this is in practice. Then again the formula itself is not so
     complicated if you summarize it with a table like this:

         |--------------+--------------------------|
         | Cluster size | Max cache per GB of disk |
         |--------------+--------------------------|
         |          512 |  16 MB                   |
         |         1 KB |   8 MB                   |
         |         2 KB |   4 MB                   |
         |         4 KB |   2 MB                   |
         |         8 KB |   1 MB                   |
         |        16 KB | 512 KB                   |
         |        32 KB | 256 KB                   |
         |        64 KB | 128 KB                   |
         |       128 KB |  64 KB                   |
         |       256 KB |  32 KB                   |
         |       512 KB |  16 KB                   |
         |         1 MB |   8 KB                   |
         |         2 MB |   4 KB                   |
         |--------------+--------------------------|

   - Those parameters are not going away, and as far as I'm aware
     there's no plan to deprecate them.

   - In any case both l2_cache_size and refcount_cache_size specify
     the *maximum* cache size. If you set l2_cache_size=128M but then
     you only need 2, it will only use 2MB.

   - My current work (in RFC as you have seen) is about making the
     cache more efficient (you get more performance for the same cache
     size) but none of the formulas for calculating the cache size
     change. In other words: it should not affect this discussion at
     all.

   - I think cache_clean_interval is the least controversial of them
     all: it removes unused cache entries after N seconds, that has
     nothing to do with the cache size or the properties of the disk
     image, it has more to do with the usage patterns of the disk
     image, and it's an option that is easy to understand.

   - I haven't analyzed refcount_cache_size much, I don't think it's
     has a big effect on performance, but I never tested it.

So the main problem here is how to specify the L2 cache size from
libvirt (I'm ignoring the refcount cache as I said earlier). I think
there are two main ways to do that:

   a) The user knows the formula from the table above and decides the
      exact cache size. That would be a one-to-one mapping between the
      libvirt setting and the QEMU setting.

   b) The user knows the maximum percentage of disk that is going to
      be covered by the L2 cache and sets the %. This will allow the
      user to say "I want full performance (i.e. full cache size) for
      this image no matter its size".

And of course a combination of both (i.e. full cache size up to a
maximum of x MB). I think they are two valid use cases, so I can
understand that people would like to have both covered. I don't have a
strong opinion to be honest, but feel free to discuss it in qemu-block
or if you want to put me in Cc in the libvirt-devel discussion I can
also answer questions there.

I hope this helped!

Berto

On Wed, Sep 27, 2017 at 02:09:37PM -0400, John Ferlan wrote:
> 
> 
> Berto -
> 
> Looking for some thoughts or advice...  I'm a libvirt developer who's
> been reviewing patches from an upstream contributor to allow setting
> with the qcow2 l2_cache_size, refcount_cache_size, and
> cache_clean_interval from XML. This is essentially related to
> 
>  https://bugzilla.redhat.com/show_bug.cgi?id=1377735
> 
> The most recent patches are here:
> 
> https://www.redhat.com/archives/libvir-list/2017-September/msg00553.html
> 
> I don't follow qemu-devel that closely nor do I have the in depth
> knowledge of the problem space, but notice you recently posted an RFC
> related to l2_cache_size.
> 
> So if you have a few cycles...
> 
> 1. Any feeling positively or negatively towards adding the ability from
> libvirt to tweak those parameters?
> 
> 2. If your current RFC eventually ends up making changes - would the
> changes ignore or use the existing values?
> 
> There's some pushback from other libvirt team members over providing XML
> for those parameters. I'm 50/50 - I see value, but I also understand the
> trepidation about adding such parameters since they are so specific and
> require someone to read something from the QEMU source tree in order to
> understand how to set properly based on the environment being used.
> 
> On other hand - if your future work is going to make it so that if
> someone does provide a value that may in the future then be really bad
> to provide, then perhaps we should avoid allowing setting of the
> parameters. If though your new work involves some new parameter that
> supercedes the existing values, then I suppose libvirt exposing the
> existing parameter doesn't hurt. Just leaves it up to someone to expose
> the new value.
> 
> Thanks in advance -
> 
> John

--- End Message ---
--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list