Re: [RFC] finegrained disk driver options control

"Daniel P. Berrange" <berrange@xxxxxxxxxx> · Mon, 20 Mar 2017 09:59:09 +0000

On Mon, Mar 20, 2017 at 11:11:42AM +0300, Denis V. Lunev wrote:
> On 03/18/2017 12:59 PM, Daniel P. Berrange wrote:
> > On Thu, Mar 16, 2017 at 08:31:08PM +0300, Denis V. Lunev wrote:
> >> On 03/16/2017 05:45 PM, Daniel P. Berrange wrote:
> >>> On Thu, Mar 16, 2017 at 05:08:57PM +0300, Denis V. Lunev wrote:
> >>>> Hello, All!
> >>>>
> >>>> There is a problem in the current libvirt implementation. domain.xml
> >>>> allows only a basic set of options to be specified, especially in the
> >>>> case of QEMU, where the format drivers have a lot of tweaks.
> >>>> Most likely these options will never be supported in a good way
> >>>> in libvirt as recognizable entities.
> >>>>
> >>>> Right now, in order to debug a libvirt QEMU VM in production, I am using
> >>>> a very strange approach:
> >>>> - the disk section of the domain XML is removed
> >>>> - the exact command line options to start the disk are specified at the end
> >>>>   of domain.xml within <qemu:commandline> as described by Stefan
> >>>>  
> >>>> http://blog.vmsplice.net/2011/04/how-to-pass-qemu-command-line-options.html
> >>>>
> >>>> The problem is that when debugging is finished and a viable combination
> >>>> of options is found, I cannot leave the VM in such a state in production.
> >>>> This is a pain and a problem. For example, I have spent 3 days with the
> >>>> VM of one customer who blames us for slow IO in the guest. I have
> >>>> found a very good combination of non-standard options which increases
> >>>> disk performance 5 times (not 5%). Currently I cannot put this combination
> >>>> into production as libvirt does not see the disk.
> >>>>
> >>>> I propose a very simple thing (maybe I am not the first one here):
> >>>> it would be nice to allow arbitrary options to be passed to the QEMU
> >>>> command line. This could be done in a very generic way if we
> >>>> allow additional options to be specified inside the <driver> section like this:
> >>>>
> >>>>     <disk type='file' device='disk'>
> >>>>       <driver name='qemu' type='qcow2' cache='none' io='native' iothread='1'>
> >>>>           <option name='l2-cache-size' value='64M'/>
> >>>>           <option name='cache-clean-interval' value='32'/>
> >>>>       </driver>
> >>>>       <source file='/var/lib/libvirt/images/rhel7.qcow2'/>
> >>>>       <target dev='sda' bus='scsi'/>
> >>>>       <address type='drive' controller='0' bus='0' target='0' unit='0'/>
> >>>>     </disk>
> >>>>
> >>>> and so on. The meaning (at least for QEMU) is quite simple -
> >>>> these options will just be added to the end of the -drive command
> >>>> line. The meaning for other drivers should be the same and I
> >>>> think that there are ways to pass generic options in them.
> >>> It is a general policy that we do *not* do generic option passthrough
> >>> in this kind of manner. We always want to represent concepts explicitly
> >>> with named attributes, so that if 2 hypervisors support the same concept
> >>> we can map it the same way in the XML.
> >>>
> >> In general this policy means that management software which
> >> wants to implement some differentiation between VMs, e.g.
> >> in disk tuning, is forced to use the qemu:commandline backdoor.
> >> That is a pity. Exactly like in the case with additional logs.
> >>
> >> Thank you for the discussion. At least I have found a new way
> >> to perform some fine tuning.
> > Ignoring the question of generic option passthrough, I think we can model
> > the cache settings in libvirt XML explicitly. Other types of disk besides
> > qcow2 can have a cache concept, so I think we could create something like
> > this:
> >
> >    <driver name='qemu' type='qcow2' ....>
> >        <cache>
> >           <clean interval="2" unit="seconds"/>
> >           <bank name="l2" size="1024" unit="KiB"/>
> >           <bank name="refcount" size="1024" unit="KiB"/>
> >        </cache>
> >    </driver>
> >
> > The "bank" element would be permitted to be repeated multiple times if a
> > particular disk driver had multiple caches it needed.
> >
> > In the storage vol XML, we would want a way to report what the size of
> > the L2 and refcount tables are when reporting qcow2 volumes, so apps
> > know the maximum sensible size to use for cache.
> >
> For cache and anything which could be bound as cache this is not that
> difficult. But are you going to limit possible bank names? Without
> the limit this would work exactly the same as I have proposed.
> With the limit, i.e. an understanding of the allowed banks on a
> per-format basis, we will get stuck in a really LOT of details.

It would certainly validate that the cache names match those supported by
the image format, as well as validating that the values are integers.
As mentioned, the storage volume XML would also report the caches supported
by each storage volume in a storage pool, giving applications a way
to learn what caches are available for the format. This is very different
to just providing blind passthrough of any qcow option.
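
To illustrate the kind of check meant here, a minimal sketch in Python
follows. It is not libvirt code. The element and attribute names come from
the proposed <cache> XML quoted above; the SUPPORTED_BANKS table, the
UNIT_BYTES table and the validate_driver_cache() function are hypothetical
names made up for this example, and the per-format bank list is an
assumption rather than something read from a real image.

```python
import xml.etree.ElementTree as ET

# Hypothetical table: cache banks each image format is assumed to expose.
SUPPORTED_BANKS = {"qcow2": {"l2", "refcount"}}

# Byte multipliers for the units this sketch understands (assumption).
UNIT_BYTES = {"B": 1, "KiB": 1024, "MiB": 1024 ** 2}

# The <driver> fragment proposed earlier in the thread.
EXAMPLE = """
<driver name='qemu' type='qcow2'>
  <cache>
    <clean interval='2' unit='seconds'/>
    <bank name='l2' size='1024' unit='KiB'/>
    <bank name='refcount' size='1024' unit='KiB'/>
  </cache>
</driver>
"""


def validate_driver_cache(driver_xml):
    """Return {bank name: size in bytes}; raise ValueError on bad input."""
    driver = ET.fromstring(driver_xml)
    fmt = driver.get("type")
    allowed = SUPPORTED_BANKS.get(fmt, set())
    sizes = {}
    for bank in driver.findall("./cache/bank"):
        name = bank.get("name")
        # Reject bank names the image format does not know about.
        if name not in allowed:
            raise ValueError(f"format {fmt!r} has no cache bank {name!r}")
        # Reject sizes that are not plain non-negative integers.
        size = bank.get("size", "")
        if not (size.isascii() and size.isdigit()):
            raise ValueError(f"bank {name!r}: size {size!r} is not an integer")
        unit = bank.get("unit", "B")
        if unit not in UNIT_BYTES:
            raise ValueError(f"bank {name!r}: unknown unit {unit!r}")
        sizes[name] = int(size) * UNIT_BYTES[unit]
    return sizes


if __name__ == "__main__":
    print(validate_driver_cache(EXAMPLE))
```

With blind passthrough there is nothing equivalent to the name check: a
misspelt or unsupported option would only be noticed when QEMU itself
rejects it.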

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list