Re: [PATCH] KVM: Introduce dynamically registered hypercall capability

Radim Krčmář <rkrcmar@xxxxxxxxxx> · Tue, 2 Dec 2014 17:08:33 +0100

(tl;dr version at the bottom)

2014-12-01 15:43-0800, Phil White:
> On Mon, Dec 1, 2014 at 5:47 AM, Radim Krčmář <rkrcmar@xxxxxxxxxx> wrote:
> > 2014-11-28 17:29-0800, Phil White:
> >> Good questions.
> >>
> >> One thing that prompted this code is the presence and proliferation of
> >> architecture specific hypercalls in include/uapi/linux/kvm_para.h.
> >> I'm not sure why the tree has continued on for as long as it has with
> >> a list of reserved hypercall indices -- most of which are unused on
> >> any given architecture.  Is there a reason that I'm unaware of?
> >
> > Name-space won't be exhausted, so nothing forced them to separate and
> > centralization easily avoids conflicts with generic hypercalls.
> 
> Consider: All the arch specific defines were defined in asm/kvm_para.h.
> Each asm/kvm_para.h defines KVM_HC_ARCH_MAX as a relative
> index from which generic hypercalls ought to be applied (e.g.
> KVM_HC_ARCH_MAX+1 rather than 11).

Every hypercall is still hardcoded on host's side, so this is a harmful
complication.  (When compared with current state and pre-defined arch
space, like -1,-2,...)

The guest can deal with it quite easily, a special hypercall that
returns the first generic one, but there is no real reason to.

> This would at least organize the hypercalls and avoid a situation in which a
> number of hypercalls which are not applicable pollute the namespace.  I'll
> grant that the grounds here may be largely aesthetic.

(It is a commendable pursuit, working code definitely isn't enough.)

> The other worry is that institutionalization of this method will lead to a
> hesitance to associate a specific hypercall index with anything other than the
> function which it has been assigned in past kernel revisions.

Yes, we'd have keep "legacy" hypercalls.
Breaking old guests isn't what we want.

> In addition, this leads to a maintenance problem for anyone seeking to add a
> hypercall in the future in which their hypercalls will need to be
> updated in order
> to avoid collisions with the community as well as any other sources they may
> be dealing with.

(I think that this wouldn't increase the load on maintainers by much.)

> These are all minor headaches, but they can be avoided.  A registration
> method like this -- albeit somewhat more refined -- could be used to eliminate
> all of those headaches in my opinion.

What worries me is the hypercall negotiation ...
If we added truly dynamic hypercall numbers, then the guest would have
to agree with host on function/position of hypercalls.

This has a major drawback:  host and guest have no common definition for
hypercalls => they do not know what the other is talking about.
This can be solved by introducing a "hypercall protocol", which it is
just a more round-about way of having hardcoded ids ...

(You did that by having shared memory that exposed a structure that was
 decoded by your guest.)

> > I'd say that a virtio device is the way to go if you want to stay in the
> > kernel, but outside of kvm modules.  In which ways is virtio lacking?
> 
> Virtio has several limitations.  It implies a situation in which the system has
> already booted.  Secondly, there's no easy way to access the kvm structure.
> Thirdly, it cannot be used effectively to implement an optimization for
> virtualization on a platform.  Fourthly, I believe it would require changes to
> qemu command lines -- and any associated tools which might be used to
> cobble together a qemu command line.

True, I misunderstood the scope of your modification.  I think it would
be "easier" to merge the paravirtualization into KVM+Linux ...

Calling your code by live-patching the hypercall handler could be
mentioned as an easy solution, but it has its problems ...
(A continued use of forked kernel is definitely the easiest.)

> A simple way of putting it, using the existing in-kernel code: I don't see how
> you could use virtio to map the powerpc magic page at bootup.

Agreed, and this code dwells in KVM modules because of that.

(I wasn't talking about existing hypercalls, just foreign modules.)

> >> It does occur to me that in the absence of the setup which I had
> >> available, one could simply treat hc_nr as a 4 character ID rather
> >> than a particular digit.
> >
> > (This would probably solve the situation in practice, but the conflict
> >  is still there, so design hasn't improved.)
> 
> I'm not sure which conflict you mean.  I presume you mean the possibility
> that two separate modules may attempt to claim the same hypercall index?

Yes, integer -> char[4] just switched from sequential assignment to a
"random" one and shrinked the space.  (If people used random generator
for new hypercall numbers, it would have a similar effect.)

> Presuming you do -- and I may be arguing a straw man here -- I'm not sure
> that's classifiable as a design flaw as no method occurs to me by which
> one could add the capability of dynamically registering a hypercall and have
> access to the capabilities I mentioned above.

Linux has everything it cares about in one tree, so it probably isn't
dynamic in that sense -- solution is the hardcoded reservation.
You know how many hypercalls your code is going to need ...

> However, I can be trivially proven wrong by the suggestion of a design by
> which an out-of-tree module could dynamically register a hypercall, have
> access to the kvm structure (at a minimum), and avoid the possibility that
> some other user may race to claim that hypercall.

Sorry, neither I think this is possible:
a module can mimic the registration procedure of other module to the
point where it can't be distinguished in the guest.
(Modules can do *everything*, but we keep in-tree ones working.)

---
I'll try to gather the points above:
- Dynamic hypercall ids boil down to something hardcoded, so we
  actually gain by having a static system.
  (=> you should drop the dynamic part of the patch)

- Current mix of arch+generic is acceptable and function pointers in a
  list/tree don't improve it *at all*.

- I don't like that we would add code that has no in-tree use.
  What does Linux gain by having this?
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html