Anthony Liguori wrote:
> Gregory Haskins wrote:
>> Note: No one has ever proposed to change the virtio-ABI.
>
> virtio-pci is part of the virtio ABI. You are proposing changing that.

I'm sorry, but I respectfully disagree with you here. virtio has an
ABI...I am not modifying that. virtio-pci has an ABI...I am not
modifying that either. The subsystem in question is virtio-vbus, and it
is a completely standalone addition to the virtio ecosystem.

By your argument, virtio and virtio-pci should fuse together, and
virtio-lguest and virtio-s390 should go away because they diverge from
the virtio-pci ABI, right? I seriously doubt you would agree with that
statement. The fact is, the design of virtio not only permits modular
replacement of its transport ABI, it encourages it. So how is
virtio-vbus any different from the other three?

I understand that it means you need to load a new driver in the guest,
and I am ok with that. virtio-pci was once a non-upstream driver too,
and required someone to explicitly load it, didn't it? You gotta crawl
before you can walk...

> You cannot add new kernel modules to guests and expect them to remain
> supported.

??? Of course you can. How is this different from any other driver?

> So there is value in reusing existing ABIs

Well, I won't argue with you on that one. There is certainly value
there. My contention is that sometimes the liability of an ABI is
greater than its value, and that's when it's time to re-evaluate the
design decisions that led to re-use vs re-design.

>>> I think the reason vbus gets better performance for networking today
>>> is that vbus' backends are in the kernel while virtio's backends are
>>> currently in userspace.
>>
>> Well, with all due respect, you also said initially when I announced
>> vbus that in-kernel doesn't matter, and tried to make virtio-net run
>> as fast as venet from userspace ;) Given that we never saw those
>> userspace patches from you that in fact equaled my performance, I
>> assume you were wrong about that statement. Perhaps you were wrong
>> about other things too?
>
> I'm wrong about a lot of things :-) I haven't yet been convinced that
> I'm wrong here though.
>
> One of the gray areas here is what constitutes an in-kernel backend.
> tun/tap is a sort of an in-kernel backend. Userspace is still involved
> in all of the paths. vhost seems to be an intermediate step between
> tun/tap and vbus. The fast paths avoid userspace completely. Many of
> the slow paths involve userspace still (like migration apparently).
> With vbus, userspace is avoided entirely. In some ways, you could
> argue that slirp and vbus are opposite ends of the virtual I/O
> spectrum.
>
> I believe strongly that we should avoid putting things in the kernel
> unless they absolutely have to be.

I would generally agree with you on that. Particularly in the case of
kvm, having slow-path bus-management code in-kernel is not strictly
necessary, because KVM has qemu in userspace. The issue here is that
vbus is designed to be a generic solution for in-kernel virtual-IO. It
will support (via abstraction of key subsystems) a variety of
environments that may or may not offer the same facilities as KVM, and
therefore it represents the least common denominator as far as what
external dependencies it requires.

The bottom line is this: despite the tendency for people to jump at
"don't put much in the kernel!", the fact is that a "bus" designed for
software-to-software use (such as vbus) is almost laughably trivial.
It's essentially a list of objects that each have an int (dev-id) and a
char* (dev-type) attribute. All the extra goo that you see me setting
up in something like the kvm-connector needs to be done for the
fast-path _anyway_, so transporting the verbs to query this simple list
is not really a big deal. If we were talking about full ICH emulation
for a PCI bus, I would agree with you. In the case of vbus, I think the
concern is overstated.
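To make that concrete, here is roughly the shape of the thing I am
describing. Note this is just an illustrative sketch of the concept;
the names and layout are made up and are not the actual vbus symbols:

#include <linux/list.h>

/*
 * A software-to-software "bus" is little more than a list of devices,
 * each carrying an id and a type string.
 */
struct demo_device {
	struct list_head  node;  /* linkage on the bus's device list */
	int               id;    /* the "dev-id" attribute */
	const char       *type;  /* the "dev-type" attribute, e.g. "venet" */
};

struct demo_bus {
	struct list_head  devices;  /* the entire "bus" state */
};

/* Enumeration/lookup on the bus is a trivial list walk. */
static struct demo_device *demo_bus_find(struct demo_bus *bus, int id)
{
	struct demo_device *dev;

	list_for_each_entry(dev, &bus->devices, node)
		if (dev->id == id)
			return dev;

	return NULL;
}

Compare that to what faithful PCI-bus emulation entails, and you can
see why I say the bus itself is not where the complexity lives.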
> I'm definitely interested in playing
> with vhost to see if there are ways to put even less in the kernel. In
> particular, I think it would be a big win to avoid knowledge of slots in
> the kernel by doing ring translation in userspace.

Ultimately I think that would not be a very good proposition. Ring
translation is actually not that hard, and doing it in userspace as you
propose would definitely add a measurable source of latency. But I will
not discourage you from trying, if that is what you want to do.
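For reference, the "knowledge of slots" in question essentially boils
down to translating the guest-physical addresses found in the ring into
host virtual addresses. Conceptually it is something like the following
(again, a made-up sketch to illustrate the idea, not the actual kvm or
vhost code):

#include <stdint.h>
#include <stddef.h>

/*
 * Each slot maps a contiguous range of guest-physical address space
 * onto host virtual memory.
 */
struct mem_slot {
	uint64_t  gpa_base;  /* guest-physical base address */
	uint64_t  size;      /* length of the region in bytes */
	uint8_t  *hva_base;  /* where the region is mapped in the host */
};

/* Translate one guest-physical address to a host virtual address. */
static void *gpa_to_hva(const struct mem_slot *slots, int nslots,
			uint64_t gpa)
{
	int i;

	for (i = 0; i < nslots; i++) {
		const struct mem_slot *s = &slots[i];

		if (gpa >= s->gpa_base && gpa - s->gpa_base < s->size)
			return s->hva_base + (gpa - s->gpa_base);
	}

	return NULL;  /* not backed by guest RAM */
}

Every descriptor pulled off the ring goes through a lookup like that.
In the kernel, it's a cheap table walk; bouncing out to userspace just
to perform it adds a transition to every fast-path operation.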
> This implies a
> userspace transition in the fast path. This may or may not be
> acceptable. I think this is going to be a very interesting experiment
> and will ultimately determine whether my intuition about the cost of
> dropping to userspace is right or wrong.

I can already tell you it's wrong: from my own experience playing in
this area, even extra kthread switches can hurt...

>> Conversely, I am not afraid of requiring a new driver to optimize the
>> general PV interface. In the long term, this will reduce the amount
>> of reimplementing the same code over and over, reduce system
>> overhead, and it adds new features not previously available (for
>> instance, coalescing and prioritizing interrupts).
>
> I think you have a lot of ideas and I don't know that we've been able
> to really understand your vision. Do you have any plans on writing a
> paper about vbus that goes into some of your thoughts in detail?

I really need to, I know...

>>> If that's the case, then I don't see any
>>> reason to adopt vbus unless Greg thinks there are other compelling
>>> features over virtio.
>>
>> Aside from the fact that this is another confusion of the vbus/virtio
>> relationship...yes, of course there are compelling features (IMHO) or
>> I wouldn't be expending effort ;) They are at least compelling enough
>> to put in AlacrityVM.
>
> This whole AlacrityVM thing is really hitting this nail with a
> sledgehammer.

Note that I didn't really want to go that route. As you know, I first
tried pushing this straight through kvm, starting earlier this year,
but I was met with reluctance to even bother truly understanding what I
was proposing, comments like "tell me your ideas so I can steal them",
and "sorry, we are going to reinvent our own instead". That isn't
exactly going to motivate someone to continue pushing these ideas
within that community, and I was made to feel (purposely?) unwelcome at
times. So I could either roll over and die, or start my own project.

In addition, almost all of vbus is completely independent of kvm anyway
(I think there are only 3 patches that actually touch KVM, and they are
relatively minor), and vbus doesn't really fit into any other category
of maintained subsystem either. So it really calls for a new branch of
maintainership, which I currently hold. AlacrityVM will serve as the
collaboration point for that maintainership. The bottom line is, there
are people out there who are interested in what we are doing (and that
number grows every day). Starting a new project wasn't what I wanted
per se, but I don't think there was much choice.

> While the kernel needs to be very careful about what it
> pulls in, as long as you're willing to commit to ABI compatibility, we
> can pull code into QEMU to support vbus. Then you can just offer vbus
> host and guest drivers instead of forking the kernel.

Ok, I will work on pushing those patches next.

>> If upstream KVM doesn't want them, that's KVM's
>> decision and I am fine with that. Simply never apply my qemu patches
>> to qemu-kvm.git, and KVM will be blissfully unaware if vbus is
>> present.
>
> As I mentioned before, if you submit patches to upstream QEMU, we'll
> apply them (after appropriate review). As I said previously, we want
> to avoid user confusion as much as possible. Maybe this means limiting
> it to -device or a separate machine type. I'm not sure, but that's
> something we can discuss on qemu-devel.

Ok.

Kind Regards,
-Greg