Re: [RFC] finegrained disk driver options control

"Denis V. Lunev" <den@xxxxxxxxxxxxx> · Thu, 16 Mar 2017 19:56:12 +0300



On 03/16/2017 06:20 PM, Daniel P. Berrange wrote:
> On Thu, Mar 16, 2017 at 06:15:27PM +0300, Denis V. Lunev wrote:
>> On 03/16/2017 06:08 PM, Daniel P. Berrange wrote:
>>> On Thu, Mar 16, 2017 at 06:00:46PM +0300, Denis V. Lunev wrote:
>>>> On 03/16/2017 05:45 PM, Daniel P. Berrange wrote:
>>>>> On Thu, Mar 16, 2017 at 05:08:57PM +0300, Denis V. Lunev wrote:
>>>>>> Hello, All!
>>>>>>
>>>>>> There is a problem in the current libvirt implementation. domain.xml
>>>>>> allows only a basic set of options to be specified, which is a real
>>>>>> limitation for QEMU, whose format drivers expose a lot of tweaks.
>>>>>> Most likely these options will never be properly supported
>>>>>> in libvirt as recognizable entities.
>>>>>>
>>>>>> Right now, in order to debug a libvirt QEMU VM in production, I am using
>>>>>> a very strange approach:
>>>>>> - the disk section of the domain XML is removed
>>>>>> - the exact command line options to start the disk are specified at the end
>>>>>>   of domain.xml within <qemu:commandline>, as described by Stefan
>>>>>>  
>>>>>> http://blog.vmsplice.net/2011/04/how-to-pass-qemu-command-line-options.html
>>>>>>
>>>>>> The problem is that when debugging is finished and a viable combination of
>>>>>> options is found, I cannot leave the VM in such a state in production. This
>>>>>> is a real pain. For example, I have spent 3 days with the
>>>>>> VM of one customer who blames us for slow IO in the guest. I have
>>>>>> found a very good combination of non-standard options which increases
>>>>>> disk performance 5 times (not 5%). Currently I cannot put this combination
>>>>>> into production, as libvirt does not see the disk.
>>>>>>
>>>>>> I propose a very simple thing; maybe I am not the first one here,
>>>>>> but it would be nice to be able to pass arbitrary options to the QEMU
>>>>>> command line. This could be done in a very generic way if we
>>>>>> allow additional options to be specified inside the <driver> section, like this:
>>>>>>
>>>>>>     <disk type='file' device='disk'>
>>>>>>       <driver name='qemu' type='qcow2' cache='none' io='native' iothread='1'>
>>>>>>           <option name='l2-cache-size' value='64M'/>
>>>>>>           <option name='cache-clean-interval' value='32'/>
>>>>>>       </driver>
>>>>>>       <source file='/var/lib/libvirt/images/rhel7.qcow2'/>
>>>>>>       <target dev='sda' bus='scsi'/>
>>>>>>       <address type='drive' controller='0' bus='0' target='0' unit='0'/>
>>>>>>     </disk>
>>>>>>
>>>>>> and so on. The meaning (at least for QEMU) is quite simple -
>>>>>> these options will just be appended to the -drive command
>>>>>> line. The meaning for other drivers should be the same, and I
>>>>>> think they also have ways to accept generic options.
>>>>> It is a general policy that we do *not* do generic option passthrough
>>>>> in this kind of manner. We always want to represent concepts explicitly
>>>>> with named attributes, so that if 2 hypervisors support the same concept
>>>>> we can map it the same way in the XML.
>>>> OK. How can I change the L2 cache size for a QCOW2 image?
>>>>
>>>> For a 1 TB disk, fragmented in the guest, the performance loss is
>>>> around 10 times. 10 TIMES. 1000%. The customer cannot
>>>> wait for a proper fix in the next QEMU release, especially
>>>> if we are able to provide a kludge specifically for him.
>>> We can explicitly allow the L2 cache size to be set in the XML, but that
>>> is a pretty poor solution to the problem IMHO, as the mgmt
>>> application has no a priori knowledge of whether a particular
>>> cache size is going to be right for a particular qcow2 image.
>>>
>>> For a sustainable solution, IMHO this really needs to be fixed
>>> in QEMU so it has either a more appropriate default, or if a
>>> single default is not possible, have QEMU auto-tune its cache
>>> size dynamically to suit the characteristics of the qcow2 image.
>> Yes, I agree. That is why I spoke of a kludge.
>>
>>
>>>> There is the <qemu:commandline> option, which works exactly
>>>> like this. It is enabled only with a changed schema.
>>>> OK, we can have this option enabled only under the same
>>>> condition. But we have to have a way to solve the problem
>>>> right now, not after 3 months of painful dances within
>>>> the driver. Maybe with limitations like an increased memory
>>>> footprint, but still.
>>> Sure, you can use <qemu:commandline> passthrough - that is the explicit
>>> temporary workaround - we don't provide any guarantee that your guest
>>> won't break when upgrading either libvirt or QEMU though, hence we
>>> mark it as tainted.
>> No and yes. Yes, <qemu:commandline> partially solves the situation.
>> No, this solution has too strong drawbacks IMHO. The configuration
>> of this VM can no longer be changed in any viable way, and
>> there are a lot of problems because one disk is absent at the libvirt level.
>>
>> Can we add the option at the disk level specifically, available only
>> when the VM config is tainted and the debug schema is enabled? This would
>> be the best partial solution, which will not ruin other management tasks
>> like backup, disk add etc.
> We really don't want to propagate the custom passthrough into further
> areas of the XML. It is intentionally limited because it is not
> something we want people/apps to use for anything other than a short
> term hack. You should be able to use qemu's '-set' argument to set
> fields against existing QEMU args, without having to throw away the
> entire libvirt <disk> config built by libvirt.
>
> Regards,
> Daniel
Technically this solves the problem, but still we need to specify options
without such hacks.
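
For reference, a minimal sketch of what the '-set' workaround could look like
for the disk from the example above, with the regular <disk> element left in
place. The drive id 'drive-scsi0-0-0-0' is an assumption; the real id has to
be copied from the -drive argument that libvirt generates for this guest
(visible in /var/log/libvirt/qemu/<domain>.log). Whether a particular driver
option can be set this way may depend on the QEMU version.

```xml
<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
  <!-- the rest of the domain definition, including the <disk>, is elided -->
  <qemu:commandline>
    <!-- QEMU syntax: -set group.id.arg=value -->
    <!-- 'drive-scsi0-0-0-0' is an assumed drive id, not a verified one -->
    <qemu:arg value='-set'/>
    <qemu:arg value='drive.drive-scsi0-0-0-0.l2-cache-size=64M'/>
    <qemu:arg value='-set'/>
    <qemu:arg value='drive.drive-scsi0-0-0-0.cache-clean-interval=32'/>
  </qemu:commandline>
</domain>
```

The guest is still marked as tainted with this configuration, but libvirt
keeps seeing the disk, so backup and disk add are not affected.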

Den

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list