Re: libvirt profiles (a.k.a. virtuned) design ideas draft

Martin Kletzander <mkletzan@xxxxxxxxxx> · Tue, 17 Jul 2018 12:11:31 +0200

On Mon, Jul 09, 2018 at 06:10:28PM +0100, Daniel P. Berrangé wrote:
On Mon, Jul 09, 2018 at 05:01:25PM +0300, Martin Kletzander wrote:
On Thu, Jul 05, 2018 at 05:58:46PM +0100, Daniel P. Berrangé wrote:
> On Tue, Jul 03, 2018 at 04:41:52PM -0400, Cole Robinson wrote:
> > > ## Brief specification of functionality
> > >
> > > Currently virtuned aims to provide a consistent way of applying profiles to
> > > libvirt VM definitions.  That way management applications don't need to
> > > duplicate the implementation in their codebases.
> > >
> > > ### Functions
> > >
> > > As a starting point virtuned exposes one function.  As input the function
> > > accepts a VM definition with the only restriction being that it is a libvirt
> > > domain XML.  However it doesn't have to be complete.  The function applies all
> > > relevant profiles to that XML and produces a complete libvirt domain XML.
> > >
> > > The outcome of this is twofold:
> > > - Every libvirt domain XML is already working virtuned XML.
> > > - Applications can select, by arbitrarily small steps, how much functionality
> > >   they want to use from virtuned.
>
> I'm not sure I understand this second point. IIUC, the contents of the profiles
> are supposed to be opaque to the mgmt application. So while they use virtuned,
> they'll be exposed to whatever arbitrary XML the profile contains, whether
> they understand it or not.
>

Why would they need to be opaque to the mgmt app?  Either you are using some of
the profiles that are shipped with it (in which case the mgmt app developers
should know what they are using in the code) or the mgmt app can construct its
own profile to be used, in which case it should know what it is asking for.

In previous discussions on this topic it was suggested that the selling point
for profiles was to allow new features to be enabled in multiple mgmt apps
without having to add support to each mgmt app to format XML, potentially with
the end user providing arbitrary profiles. This implies that the mgmt app
considers the profile contents to be opaque. Based on your answer though, it
seems this is not in fact a goal.

It is not.  The first idea was basically just creating basic XML from simple
bits and pieces.  It got out of hand when people started suggesting features, up
to the point where it got ridiculous IMHO.

Not allowing arbitrary black-box profiles would indeed be my preference,
since I don't think it is practical to support it in the real world given
the complex interactions that will fall out of that.

We definitely need the mgmt app to understand either the profile or the result
of the profile being used (the output XML).  And that needs to happen before the
VM is scheduled because there might be things that the scheduler needs to
handle.

> > > ### API endpoints ###
> > >
> > > For now the API will be exposed as:
> > >
> > > 1. Python module - trivial if we're basing it on virt-manager codebase which is
> > >    using python
>
> What are the key reasons/benefits to be part of virt-manager codebase as opposed
> to a standalone project ?
>

A few things:

1) The XMLBuilder makes it easier to work with the XML, particularly the domain
   XML.  This is not that big of a deal since libvirt-go-xml does a good job of
   that as well

2) There is existing logic for "intermediate" devices.  By that I mean the
   devices that are needed to add the requested one.  For example when
   requesting an addition of a SATA disk, there is already logic that figures
   out if there is an existing SATA controller with a free slot and adds one if
   there is not.  The reason for this is that there might be some defaults
   specified which affect the intermediate devices.

3) The possibility of exposing virt-xml and virt-install in the future.  The
   former would be used for making changes to the XML and the latter is
   something that stateless mgmt apps would like to use (cockpit currently).
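
To illustrate the "intermediate" device logic from point 2, here is a minimal
sketch in plain Python.  It is not virt-manager's actual code; the function
name add_sata_disk and the constant SATA_UNITS are made up for this example,
the limit of 6 units per controller is an assumption of the sketch, and disks
without an explicit <address> are simply ignored when counting used slots.

```python
import xml.etree.ElementTree as ET

# Assumption for this sketch: each SATA controller offers 6 units.
SATA_UNITS = 6


def add_sata_disk(devices, source_file, target_dev):
    """Add a SATA disk, creating a SATA controller if none has a free unit."""
    # Collect the units already taken on each SATA controller.
    used = {}
    for disk in devices.findall("disk"):
        target = disk.find("target")
        addr = disk.find("address")
        if target is not None and target.get("bus") == "sata" and addr is not None:
            ctrl = int(addr.get("controller", "0"))
            used.setdefault(ctrl, set()).add(int(addr.get("unit", "0")))

    controllers = sorted(
        int(c.get("index", "0"))
        for c in devices.findall("controller")
        if c.get("type") == "sata"
    )

    # Prefer an existing controller that still has a free unit.
    chosen = None
    for index in controllers:
        free = set(range(SATA_UNITS)) - used.get(index, set())
        if free:
            chosen = (index, min(free))
            break

    # Otherwise add the intermediate device: a new SATA controller.
    if chosen is None:
        index = max(controllers, default=-1) + 1
        ET.SubElement(devices, "controller", {"type": "sata", "index": str(index)})
        chosen = (index, 0)

    disk = ET.SubElement(devices, "disk", {"type": "file", "device": "disk"})
    ET.SubElement(disk, "source", {"file": source_file})
    ET.SubElement(disk, "target", {"dev": target_dev, "bus": "sata"})
    ET.SubElement(disk, "address", {
        "type": "drive", "controller": str(chosen[0]),
        "bus": "0", "target": "0", "unit": str(chosen[1]),
    })
    return disk
```

Any defaults a profile specifies for controllers would have to be applied at
the point where the new controller is created, which is why this logic is
useful to have in one place.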

FWIW, great as virt-install is, if I was writing a new mgmt application, I'd
really use GNOME Boxes installer as the benchmark. Most importantly it is
able to fully automate the installation process from installer media, by
generating the requisite kickstart files from data provided by libosinfo.

The installation process does not concern me.  It is simply the XML creation
process that it makes easier.  And for an MVP it is easier for me to get up to
speed with Python.  But it's not a hard requirement; I don't care what the
language is as long as it doesn't cost more time than the benefit is worth.

A lot of this would go away if there was a way to make libvirt process the VM
definition with only the necessary changes.  It would also help with the
addresses being figured out without running a full-blown libvirt daemon.  Maybe
it will be easier once the split is done?  I don't know.

But thanks for the gnome boxes idea, I'll forward that to the Cockpit dev so
that they can consider it.

> > > The above example will request a video card with model QXL to exist in the VM
> > > definition.  The precise outcome of this depends on the existing devices in the
> > > VM definition:
> > >
> > > - **VM has no video device:** the XML snippet (`qxl` video card) will simply be
> > >   added to the list of devices.
> > > - **VM has video device with no model specified:** Just fill in the video model
> > >   for the existing video card.
> > > - **VM has video device with different model:** Add one more video device with
> > >   the specified model since multiple video cards are perfectly fine.
> > >
> > > The above is a very concrete example, but it can be very easily and efficiently
> > > generalized for any `<add/>` sub-element.  The only information which is
> > > required for said generalization is the knowledge of libvirt's domain XML
> > > format.  This could be one of the reasons for virtuned to be spun off of
> > > virt-manager's codebase (since most of that information is already there).  The
> > > other option would be using
> > > [libvirt-go-xml](https://libvirt.org/git/?p=libvirt-go-xml.git) as that should
> > > have enough information for this as well <sup id='fn3'>[[3]](#fn3d)</sup>.
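
The three cases listed above for the QXL video card can be sketched in a few
lines of Python using only the standard library.  This is an illustration and
not virtuned code: the function name ensure_video is hypothetical, and the
early exit when a matching device already exists is an assumption on top of
the three cases in the draft.

```python
import xml.etree.ElementTree as ET


def ensure_video(domain_xml, model_type):
    """Make sure the domain has a <video> device with the given model type."""
    domain = ET.fromstring(domain_xml)
    devices = domain.find("devices")
    if devices is None:
        devices = ET.SubElement(domain, "devices")
    videos = devices.findall("video")

    def model_of(video):
        model = video.find("model")
        return None if model is None else model.get("type")

    # Assumption: if the requested model is already present, change nothing.
    if not any(model_of(v) == model_type for v in videos):
        unset = [v for v in videos if model_of(v) is None]
        if unset:
            # Video device with no model specified: fill in the model.
            video = unset[0]
            model = video.find("model")
            if model is None:
                model = ET.SubElement(video, "model")
            model.set("type", model_type)
        else:
            # No video device, or only devices with a different model:
            # add one more video device.
            video = ET.SubElement(devices, "video")
            ET.SubElement(video, "model", {"type": model_type})
    return ET.tostring(domain, encoding="unicode")
```

Generalizing this to any `<add/>` sub-element needs per-element knowledge of
what counts as "the same device" and whether duplicates are allowed, which is
exactly the schema knowledge discussed here.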
>
> FYI, libvirt-go-xml should have 100% coverage of all XML constructs in the
> libvirt schema. Any omissions are entirely due to libvirt's own master XML
> test files being incomplete. libvirt-go-xml unit tests check that it can
> roundtrip all XML files in libvirt.git without data loss. I don't think any
> other XML parser impl for libvirt has the same level of coverage, principally
> because none of them do similar kind of testing to prove it.
>

Coverage is one thing, but another thing is the logic that is in XMLBuilder
(even though it's not there for all the elements).  For example, different
sub-elements may be allowed depending on an attribute.  Or, even simpler, an
element cannot be duplicated but the struct stores it in a list.  If that is
not fully introspectable from the struct tags, then we will need to duplicate
the code that already exists in virt-manager if this is a side project.

The way I've modelled things in Go is that when there is a type=XXXX attribute
that controls which sub-elements are permitted, I've created dedicated structs
for each sub-schema. In fact you never set any 'type' attribute - we generate
the type attribute based on which struct you've created for the child content.

Oh, then I didn't look enough at the implementation.  If that is guaranteed,
then it helps a lot.
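
For readers who have not looked at that modelling either, the idea of deriving
the type attribute from the struct that was used can be sketched in Python.
This is only an analogue for illustration, not the libvirt-go-xml API; the
class names FileSource, BlockSource and Disk are invented for the example and
cover just two disk source types.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Union


@dataclass
class FileSource:
    file: str


@dataclass
class BlockSource:
    dev: str


@dataclass
class Disk:
    target_dev: str
    source: Union[FileSource, BlockSource]

    def to_xml(self):
        # The caller never sets 'type'; it follows from the source class.
        if isinstance(self.source, FileSource):
            disk_type, attrs = "file", {"file": self.source.file}
        elif isinstance(self.source, BlockSource):
            disk_type, attrs = "block", {"dev": self.source.dev}
        else:
            raise TypeError("unsupported disk source")
        disk = ET.Element("disk", {"type": disk_type, "device": "disk"})
        ET.SubElement(disk, "source", attrs)
        ET.SubElement(disk, "target", {"dev": self.target_dev})
        return disk
```

With this shape it is not possible to build a disk whose type attribute
disagrees with its sub-elements, so that part of the validation logic does not
need to be duplicated.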

> Solving these problems would require a combinatorial expansion in the
> number of profiles. eg a numa-pc, numa-q35 profile, and then a
> networking-nfv-pc, networking-nfv-q35, networking-nfv-numa-pc, and
> networking-nfv-numa-q35 profiles. There would then have to be dependencies
> expressed to tell the app which profiles can be composed with each other.
>

So this is how tuned does it and I didn't really like the way the matrix
explodes with added dimensions.

At least with tuned I think the range of profiles is probably fairly
small, since there's only so many tunables that are going to be
relevant. With the domain XML, our schema is huge, so I could easily
imagine getting into high double-figures number of profiles. So this
will explode the matrix way worse than seen with tuned.
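
To put a number on the matrix growth, here is a small sketch that enumerates
the profile names in the style of the quoted example (numa-pc,
networking-nfv-numa-q35 and so on).  The function name composite_profiles and
the extra feature names in the usage note are hypothetical; the counting
assumes one profile per non-empty feature combination and machine type.

```python
from itertools import combinations


def composite_profiles(features, machine_types):
    """List one profile name per non-empty feature combination and machine type."""
    names = []
    for size in range(1, len(features) + 1):
        for combo in combinations(features, size):
            for machine in machine_types:
                names.append("-".join(combo + (machine,)))
    return names
```

Two features and two machine types give 6 profiles; five features with the
same two machine types already give (2^5 - 1) * 2 = 62.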

> This still only solves the problem of composing profiles, and does not
> consider how to merge with the application defined XML parts. The only
> way an application can know if the XML it wants to write, is compatible
> with the profiles it has used, is if it parses and understands all the
> parts of the profile.
>

I hear what you are saying, but I don't see why the app would need to parse the
profiles.  There can be conditions in profiles (proposed in open questions) that
would eliminate the need for multiple profiles for the same thing.  Yes, a DSL
would be better for this.  We could just use what "xq" provides right away (see
open questions).  That would also solve erroring out.

My point touches slightly on the possible misunderstanding I mention above
about the scope wrt allowing end user blackbox profiles to be provided.

> If something was used in the profile that the app doesn't know about,
> it could ignore it, but the resulting VM config may well be unrunnable,
> or worse, runnable but doing something completely inappropriate.
>
>
> I think these kinds of problems are inherent in any approach which allows
> arbitrary user defined XML as the schema for the profiles.
>
> This is one of reasons why libosinfo didn't base the information it
> provides around the libvirt XML schema. Instead it defines its own
> domain specific language, and applications only use the features in
> it that they actually know how to handle.
>
> This means if we add some new concept to libosinfo database, applications
> are not going to automagically use it, and instead have to add explicit
> support. As above though, I think this is inevitable, because it is too
> easy to create unrunnable/nonsensical XML configs if you allow arbitrary
> user specified XML inputs.
>

Thanks for the info with the NUMA locality example.  On one hand it would really
save us a lot of work if we just used something that exists (by just extending
it) and for DSL there is a solution we can use as well.  If not then we can
build it from existing parts at least partially.

BTW, I meant to include this link to illustrate the NUMA locality example:

 https://www.berrange.com/posts/2017/02/16/setting-up-a-nested-kvm-guest-for-developing-testing-pci-device-assignment-with-numa/

I'll have a look at that.  We had some talk about the details and it looks like
targeting such advanced features is not the goal now.  We need to keep that in
mind, of course, but we need to start small as there are still numerous
misunderstandings about what we're going to do.

> > I didn't really know where to cut in so this is a big comment...
> >
> > The idea here is that virtuned will ship with something like a
> > profile/add-qxl.xml, and profile=add-qxl will then effectively be part
> > of the virtuned API, like an osinfo ID value is to libosinfo; the
> > profile will never go away, so apps can depend on it being there.
> > Presumably we can extend the profile as necessary as long as it
> > accomplishes its stated goal and we confirm it doesn't break apps.
> >

Yes, we're probably going to need to version it as well.

Hmm, yes, versioning would be key for being able to reconstruct the
exact same machine each time, even after upgrades. That said, it would
be valid to declare that profiles need to be persisted at time of VM
creation, per VM. This is how openstack deals with its "flavour"
concept - at time of VM create we copy the data for the flavour, so
we always use the original values for the life of that specific VM.

So the problem in KubeVirt is that the changes need to be done either
immediately after posting the VM definition (without libvirtd running at all) or
they need to be reproducible.  Versioning will also help us to be able to change
virtually anything in the future.
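
The persist-at-creation approach mentioned above could look roughly like the
following sketch.  It does not reflect how OpenStack or KubeVirt store their
data; the dictionary layout and the function name create_vm_record are
assumptions made only for this example.

```python
import copy


def create_vm_record(name, profile_name, profiles):
    """Store a private copy of the profile alongside the new VM."""
    profile = profiles[profile_name]
    return {
        "name": name,
        "profile": profile_name,
        "profile_version": profile["version"],
        # Deep copy so later edits to the shared profile do not leak into this VM.
        "profile_snapshot": copy.deepcopy(profile),
    }
```

Expanding the stored snapshot instead of the current profile gives the same
domain XML on every run, which covers the reproducibility requirement without
needing libvirtd.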

> > Using XML for this kind of thing makes me nervous, trying to model
> > conditional actions with XML. I feel like it's a real quick slippery
> > slope to implementing a turing complete schema. For example how would we
> > handle complex examples like:
> >

The idea to use XML was sparked by two facts:

1) Apps will be able to create their own profiles.

2) Simple profiles (addition of a few elements) could be created by just taking
   the specific part of the domain XML and wrapping it in a tag that says what
   to do (e.g. `<add><existing_xml_snippet/></add>`).

FWIW, I'm not opposed to using XML - I think it is valuable to be able
to use standardized tools for parsing / formatting / editor syntax
highlighting etc. I'm just wary about using the Domain XML schema itself,
as opposed to a custom XML schema explicitly designed for this job. If
nothing else, we've got lots of stupid mistakes in our domain XML schema,
such as the way we litter CPU/NUMA related bits across 6 different places
in the schema, making it hard to understand wtf we're expressing.

What I was afraid about was creating Yet Another VM Definition Format.  Slightly
unrelated, but I always get reminded of this: https://xkcd.com/927/

> > What's the motivation for doing this in XML? So apps or distros can drop
> > in their own profiles? Or extend system profiles? I'm wondering why XML
> > over privately implemented. Maybe you can explain some specific app
> > usecases that motivated this? I feel like I missed a lot in the previous
> > discussion
> >

You didn't miss much and you hit the two points nicely: dropping in their own
profiles and, possibly, extending existing ones.

> > Also do we expect the API to talk directly to libvirt? Like for checking
> > domcapabilities?
>

For KubeVirt that wouldn't be that much of a help as they need to do a bunch of
these things without libvirt running.  Also not being dependent on libvirt makes
it independent from the host.  Capabilities might be provided as another input,
but the question is whether it should be full-blown libvirt (dom)capabilities.
The reason is that you might need to migrate between various nodes and the mgmt
app/cluster knows the minimal requirements better than a host-oriented daemon.

I don't think it is so clearcut for KubeVirt. It is entirely possible for them
to have a libvirtd spawned to be able to query the capabilities, independently
of them launching the guest if this is a compelling benefit. It also depends
on exactly where in their code flow they'll slot in the usage and expansion
of profiles into full domain XML.

Some of the things are scheduler-related, so they need to be done before the
cluster tries to figure out where the VM is going to be scheduled (based on
the amount of RAM, some devices, whatever else).  Others need to be done after
the Pod is created (for example vCPU pinning based on what vCPUs the Pod will
get allocated).

> I tend to think writing the profiles is going to be more complex and
> error prone than directly writing the XML, because of the composability
> problems I mention above.
>
> My gut feeling is that it would be a more tractable problem if the profiles
> used a domain specific language (DSL), possibly still XML, but not libvirt
> domain XML. Applications would have to explicitly know about individual
> features in the DSL, but they could consume it in a way that the way they
> generate libvirt XML is more fully data-driven.
>
> ie, taking my example above, applications would need explicit knowledge
> of machine types, NUMA topologies, and attaching devices to NUMA nodes.
> Given that knowledge though, the decision about /when/ to use these
> respective features would be data driven from profiles that simply
> stated desired traits.
>

I lost you at the last paragraph.  Could you rephrase it or maybe give another
example?  The idea is that mgmt app knows when it wants to use what profile.
And what is provided as an API is the composition of the XML.  But you were
probably addressing something else, right?  As I said, I lost you here.

This goes back to the question of scope wrt whether profiles are blackboxes
that administrators can augment at will, or whether it is strictly limited
to stuff the application developer has decided to express. If it is the
latter, then it simplifies the process of expanding the profile to form
domain XML.

To be clear though, my thought was that if you have a DSL, you could say

 "Place guest on host node 0"

in the profile, and the application would have logic to turn that into
the domain XML that sets appropriate NUMA tunables in the various different
places, giving the application the chance to customize them taking into account
other factors. For example, the app might have been told not to use host CPUs
0 and 1, as they're reserved for OS processes. It can use that knowledge
to filter out pinning to CPUs 0 and 1, and only pin to CPUs 2-3 in that node.
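
A minimal sketch of that expansion step, assuming the application already
knows the host topology and its own list of reserved CPUs.  The function name
pin_to_node and the shape of the inputs are hypothetical; only the <cputune>
and <vcpupin> elements are taken from the domain XML format, and a real
expansion would also touch the other NUMA-related places in the schema.

```python
import xml.etree.ElementTree as ET


def pin_to_node(vcpus, node, host_topology, reserved_cpus):
    """Build a <cputune> element pinning all vCPUs to allowed CPUs of one node."""
    # The DSL only names the node; the app picks CPUs and drops reserved ones.
    allowed = sorted(set(host_topology[node]) - set(reserved_cpus))
    if not allowed:
        raise ValueError("no usable host CPUs left on node %d" % node)
    cpuset = ",".join(str(cpu) for cpu in allowed)
    cputune = ET.Element("cputune")
    for vcpu in range(vcpus):
        ET.SubElement(cputune, "vcpupin", {"vcpu": str(vcpu), "cpuset": cpuset})
    return cputune
```

With a host topology of {0: [0, 1, 2, 3], 1: [4, 5, 6, 7]} and reserved CPUs
{0, 1}, "place guest on host node 0" pins every vCPU to cpuset 2,3 without the
profile ever naming a host CPU.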

If the profile is expressed in terms of domain XML, then the profile would
be encoding specific host CPU information, and the application would have
to parse the domain XML and modify all the places which list CPUs to
remove CPUs 0 and 1. So in that sense having the profile use domain XML
isn't really simplifying life for the app - it would have been easier to
just generate the domain XML from scratch rather than parse & modify
what was written in the profile to remove 2 CPUs.

Good point.  Thanks for the ideas.  I'll keep them in mind although, as I said,
what we're trying to achieve is not very well defined yet, so I'm trying to
frame that at the same time.

Have a nice day,
Martin