On Fri, 14 Feb 2025 06:47:53 -0800
Andrea Bolognani <abologna@xxxxxxxxxx> wrote:

> On Fri, Feb 14, 2025 at 09:08:36AM -0500, Laine Stump wrote:
> > On 2/14/25 6:17 AM, Andrea Bolognani wrote:
> > > Speaking of SELinux, with the current policy on Fedora 41 I get a
> > > couple of AVC denials related to accessing the shared memory file.
> > > I understand that's expected, based on the above, but it's still
> > > quite surprising to me that the VM would start at all in this
> > > case.
> > >
> > > Just like the scenario that I've mentioned in my reply to 9/9, the
> > > network interface quietly being broken doesn't make for a great
> > > user experience. I believe this specific failure scenario, unlike
> > > the other one, is pre-existing and not easy to deal with purely
> > > through XML validation, but I really think that we should spend
> > > some effort (as a follow-up) on making sure that, if passt can't
> > > set up the network interface successfully, we report a useful
> > > error to the user instead of just leaving things broken with no
> > > clear indication that they are.
> >
> > (I guess you're talking about reporting the SELinux denial?)
> >
> > The difficulty here is that it's not libvirt getting the SELinux
> > denial, it's passt and/or QEMU, and we don't report errors from
> > either of those unless they are fatal (i.e. the other process exits
> > right away with an error code). Practically speaking though, the
> > SELinux issue you're seeing should never happen in production (as
> > long as all packages are up to date), so I don't think it's as
> > essential as the shared memory config thing to figure out all the
> > contortions necessary to report it. (Translated: "Error reporting
> > is hard!!!! Let's all go shopping!!!!") If you've got any bright
> > ideas feel free to pontificate though :-)
>
> I haven't looked into it in any detail, so no specific suggestions.
> And I agree that it won't be seen in production, so we can proceed
> as is for now, and only consider improving things further as a
> follow-up.
>
> In abstract terms, we need to be able to catch startup errors from
> passt more consistently. As a point of comparison, swtpm will
> complain very loudly and refuse to start if it can't manufacture a
> TPM device; passt being unable to create the network interface
> backend or connect to the frontend should result in a similar VM
> startup failure.
>
> My impression is that in many cases passt will attempt to proceed
> and simply log an error message on failure, possibly because the
> underlying problem can be fixed after the fact.

In this case, what you see is not a desired behaviour of passt, and
it's a bit more complicated compared to the non-vhost-user case (hence
the issue that nobody had thought about).

With or without --vhost-user, passt goes to background once its
*control interface* is up and running, which should make it convenient
for libvirt, libkrun, and even "manual" users (typically using
scripts): you start it without having to fork yourself. If it forks
and exits successfully, you know it's ready. Nothing fancy here;
that's a pretty well-established UNIX daemon convention.

In the non-vhost-user case, that socket is also the data interface, so
not much can go wrong after this phase. QEMU might fail to start (or
fail to connect to passt and hence fail to start), and in that case
libvirt will terminate passt, but that's about it.

In the vhost-user case, passt still forks once the control interface
is up and running: libvirt knows that QEMU can start now. But the data
connection is not ready yet, because that's negotiated as part of the
vhost-user protocol. So, QEMU connects and passes a file descriptor to
passt via SCM_RIGHTS, representing the shared (guest) memory that
passt can map.
If passt can't map the memory, it fails with a resounding:

	die_perror("vhost-user region mmap error");

but that's too late: passt declared success, the guest started, and
libvirt can't do much about it. We don't have guarantees as to when
this failure can happen, either.

Without vhost-user, libvirt gets a NETDEV_STREAM_DISCONNECTED event if
passt terminates for whatever reason, and it handles that by
restarting passt. QEMU's interface is configured with a --reconnect-ms
option, so restarting passt should be enough to give the guest its
connectivity back.

For vhost-user, a patch from Laurent would introduce equivalent
functionality:

  https://lore.kernel.org/qemu-devel/20250214072629.1033314-1-lvivier@xxxxxxxxxx/

but that only helps once things are up and running. If the failure
happens before the guest ever had working connectivity, it's not
clear to me what the desired behaviour is. And:

> In the context of libvirt, I don't think this applies. So maybe we
> need a "strict mode" of sorts, where passt is more willing to call
> it quits whenever something doesn't work?

...while this already happens...

> I don't know how feasible that is. It's entirely possible that I
> have incorrectly described how passt does error handling. All I know
> for sure is that the current situation, in which a VM can
> successfully be started despite ending up with a non-working network
> interface, is very clearly not acceptable in the long run.

...should libvirt really bring down the guest because, oops, on second
thought, your network interface will never work? Again, we have no
timing guarantees. Or perhaps QEMU should terminate instead?

Are you aware of any similar case we can try to use as established
practice?

-- 
Stefano