There are three issues being discussed here that aren't being clearly delineated: sharing guest-allocated memory with the host, sharing host-allocated memory with the guest, and sharing buffers between devices.

Right now, guest-allocated memory can be shared with the host through the virtqueues or by passing a scatterlist in the virtio payload (i.e. what virtio-gpu does). Host memory can be shared with the guest using the new shared memory regions. As far as I can tell, these mechanisms should be sufficient for sharing memory between the guest and host in either direction.

Where things are not sufficient is when we talk about sharing buffers between devices. For starters, a 'buffer' as we're discussing here is not something that is currently defined by the virtio spec. The original proposal defines a buffer as a generic object that is guest ram + id + metadata, created by a special buffer allocation device. With this approach, buffers can be cleanly shared between devices. An alternative that Tomasz suggested would be to avoid defining a generic buffer object, and instead state that the scatterlist which virtio-gpu currently uses is the 'correct' way for virtio device protocols to define buffers. With this approach, sharing buffers between devices potentially requires the host to map different scatterlists back to a consistent representation of a buffer.

None of the proposals directly address the use case of sharing host-allocated buffers between devices, but I think they can be extended to support it. A host buffer can be identified by the following tuple: (transport type enum, transport-specific device address, shmid, offset). I think this is sufficient even for host-allocated buffers that aren't visible to the guest (e.g. protected memory, vram), since they can still be given address space in some shared memory region, even if those addresses are actually inaccessible to the guest.
At this point, the host buffer identifier can simply be passed in place of the guest ram scatterlist with either proposed buffer sharing mechanism.

I think the main question here is whether the complexity of generic buffers and a buffer sharing device is worth it compared to the more implicit definition of buffers. Personally, I lean towards the implicit definition, since a buffer sharing device brings a lot of complexity and there aren't any clear clients of the buffer metadata feature.

Cheers,
David

On Thu, Dec 5, 2019 at 7:22 AM Dylan Reid <dgreid@xxxxxxxxxxxx> wrote:
>
> On Thu, Nov 21, 2019 at 4:59 PM Tomasz Figa <tfiga@xxxxxxxxxxxx> wrote:
> >
> > On Thu, Nov 21, 2019 at 6:41 AM Geoffrey McRae <geoff@xxxxxxxxxxxxxxx> wrote:
> > >
> > > On 2019-11-20 23:13, Tomasz Figa wrote:
> > > > Hi Geoffrey,
> > > >
> > > > On Thu, Nov 7, 2019 at 7:28 AM Geoffrey McRae <geoff@xxxxxxxxxxxxxxx> wrote:
> > > >>
> > > >> On 2019-11-06 23:41, Gerd Hoffmann wrote:
> > > >> > On Wed, Nov 06, 2019 at 05:36:22PM +0900, David Stevens wrote:
> > > >> >> > (1) The virtio device
> > > >> >> > =====================
> > > >> >> >
> > > >> >> > Has a single virtio queue, so the guest can send commands to register
> > > >> >> > and unregister buffers. Buffers are allocated in guest ram. Each buffer
> > > >> >> > has a list of memory ranges for the data. Each buffer also has some
> > > >> >>
> > > >> >> Allocating from guest ram would work most of the time, but I think
> > > >> >> it's insufficient for many use cases. It doesn't really support things
> > > >> >> such as contiguous allocations, allocations from carveouts or <4GB,
> > > >> >> protected buffers, etc.
> > > >> >
> > > >> > If there are additional constraints (due to gpu hardware I guess)
> > > >> > I think it is better to leave the buffer allocation to virtio-gpu.
> > > >>
> > > >> The entire point of this for our purposes is due to the fact that we
> > > >> can not allocate the buffer, it's either provided by the GPU driver
> > > >> or DirectX. If virtio-gpu were to allocate the buffer we might as
> > > >> well forget all this and continue using the ivshmem device.
> > > >
> > > > I don't understand why virtio-gpu couldn't allocate those buffers.
> > > > Allocation doesn't necessarily mean creating new memory. Since the
> > > > virtio-gpu device on the host talks to the GPU driver (or DirectX?),
> > > > why couldn't it return one of the buffers provided by those if
> > > > BIND_SCANOUT is requested?
> > >
> > > Because in our application we are a user-mode application in windows
> > > that is provided with buffers that were allocated by the video stack
> > > in windows. We are not using a virtual GPU but a physical GPU via
> > > vfio passthrough and as such we are limited in what we can do. Unless
> > > I have completely missed what virtio-gpu does, from what I understand
> > > it's attempting to be a virtual GPU in its own right, which is not at
> > > all suitable for our requirements.
> >
> > Not necessarily. virtio-gpu in its basic shape is an interface for
> > allocating frame buffers and sending them to the host to display.
> >
> > It sounds to me like a PRIME-based setup similar to how integrated +
> > discrete GPUs are handled on regular systems could work for you. The
> > virtio-gpu device would be used like the integrated GPU that basically
> > just drives the virtual screen. The guest component that controls the
> > display of the guest (typically some sort of a compositor) would
> > allocate the frame buffers using virtio-gpu and then import those to
> > the vfio GPU when using it for compositing the parts of the screen.
> > The parts of the screen themselves would be rendered beforehand by
> > applications into local buffers managed fully by the vfio GPU, so
> > there wouldn't be any need to involve virtio-gpu there. Only the
> > compositor would have to be aware of it.
> >
> > Of course if your guest is not Linux, I have no idea if that can be
> > handled in any reasonable way. I know those integrated + discrete GPU
> > setups do work on Windows, but things are obviously 100% proprietary,
> > so I don't know if one could make them work with virtio-gpu as the
> > integrated GPU.
> > >
> > > This discussion seems to have moved away completely from the
> > > original simple feature we need, which is to share a random block of
> > > guest allocated ram with the host. While it would be nice if it's
> > > contiguous ram, it's not an issue if it's not, and with udmabuf (now
> > > that I understand it) it can be made to appear contiguous if that is
> > > so desired anyway.
> > >
> > > vhost-user could be used for this if it is fixed to allow dynamic
> > > remapping; all the other bells and whistles that are virtio-gpu are
> > > useless to us.
> >
> > As far as I followed the thread, my impression is that we don't want
> > to have an ad-hoc interface just for sending memory to the host. The
> > thread was started to look for a way to create identifiers for guest
> > memory, which proper virtio devices could use to refer to the memory
> > within requests sent to the host.
> >
> > That said, I'm not really sure if there is any benefit of making it
> > anything other than just the specific virtio protocol accepting a
> > scatterlist of guest pages directly.
> >
> > Putting aside the ability to obtain the shared memory itself, how do
> > you trigger a copy from the guest frame buffer to the shared memory?
>
> Adding Zach for more background on virtio-wl particular use cases.