Re: A few questions about the best way to implement RandR 1.4 / PRIME buffer sharing

On 08/31/2012 08:00 PM, Dave Airlie wrote:
>> We've been experimenting with the RandR 1.4 provider object interface, so
>> that Optimus-based laptops can use our driver to drive the discrete GPU
>> and display on the integrated GPU.  The good news is that I've got a proof
>> of concept working.

> I don't suppose you'll be interested in adding the other method at some
> point as well, since saving power is probably important to a lot of people?

That's milestone 2.  I'm focusing on display offload to start because it's
easier to implement and lays the groundwork for the kernel pieces.  I have to
emphasize that I'm just doing a feasibility study right now and I can't promise
that we're going to officially support this stuff.

>> During a review of the current code, we came up with a few concerns:
>>
>> 1. The output source is responsible for allocating the shared memory
>>
>> Right now, the X server calls CreatePixmap on the output source screen and
>> then expects the output sink screen to be able to display from whatever
>> memory the source allocates.  The source has no mechanism for asking the
>> sink what its requirements are for the surface.  I'm using our own
>> internal pitch alignment requirements, and that seems to be good enough
>> for the Intel device to scan out, but that could be pure luck.

> Well, in theory it might be nice, but it would have been premature: so far
> the only interactions for PRIME are combinations of Intel, NVIDIA, and AMD,
> and I think everyone has fairly similar pitch alignment requirements.  I'd
> be interested in adding such an interface, but I don't think it's something
> I personally would be working on.

Okay.  Hopefully that won't be too painful to add if we ever need it in the
future.
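
For what it's worth, the hook I'm imagining is small.  Purely illustrative
(none of these names exist in the server today):

/* Purely illustrative -- none of these names exist in the X server today.
 * The idea: let the sink screen report its scanout constraints so that the
 * source allocates a shared pixmap both devices can actually use. */
typedef struct {
    unsigned int min_pitch_align;  /* required pitch alignment, in bytes */
    unsigned int must_be_linear;   /* nonzero if tiled layouts can't scan out */
} SharedPixmapReqs;

/* Hypothetical screen hook: the source would call this on the sink before
 * allocating the shared pixmap. */
typedef void (*QuerySharedPixmapReqsProcPtr)(void *sink_screen,
                                             SharedPixmapReqs *reqs);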

>> Should the source and sink negotiate the surface format with each other,
>> or is it sufficient to just define a lowest common denominator format, and
>> if your hardware can't deal with that format, you just don't get to share
>> buffers?

> At the moment I'm happy to just go with linear, minimum pitch alignment 64
> or 256, for us.  It might be worth defining something as a base standard,
> but yeah, I'm happy for it to work either way; I just don't have enough
> evidence it's worth it yet.  I've not looked at ARM stuff, so patches are
> welcome if people consider they need to use this stuff for SoC devices.

We can always hack it to whatever is necessary if we see that the sink side
driver is Tegra, but I was hoping for something more general.
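
For illustration, the lowest-common-denominator allocation under those
constraints boils down to something like this, assuming we just pick the
stricter 256-byte alignment rather than negotiating:

#include <stdint.h>

/* Illustrative only: a linear surface with its pitch rounded up to the
 * stricter of the two alignments mentioned above (256 bytes). */
static uint32_t lcd_scanout_pitch(uint32_t width, uint32_t bytes_per_pixel)
{
    uint32_t pitch = width * bytes_per_pixel;
    return (pitch + 255u) & ~255u;   /* round up to a 256-byte boundary */
}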

>> 2. There's no fallback mechanism if sharing can't be negotiated
>>
>> If RandR fails to share a pixmap with the output sink screen, the whole
>> modeset fails.  This means you'll end up not seeing anything on the screen
>> and you'll probably think your computer locked up.  Should there be some
>> sort of software copy fallback to ensure that something at least shows up
>> on the display?

> Ugh, it would be fairly slow and unusable; I'd rather they saw nothing.
> But again, I'm open to suggestions on how to make this work, since the
> modeset might fail for other reasons, and in that case there's still
> nothing a software copy can do.  What happens if the slave Intel device
> just fails to allocate a pixmap?  But yeah, I'm willing to think about it
> a bit more when we have some reference implementations.

Just rolling back the modeset operation to whatever was working before would be
a good start.
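
Something along these lines; the types and functions are made up, but it
captures the shape of the rollback:

/* Sketch of the rollback -- the types and functions here are stand-ins,
 * not actual X server internals. */
struct mode_config;                     /* opaque: a complete CRTC setup */
extern int apply_mode(struct mode_config *cfg);
extern int share_pixmap_with_sink(struct mode_config *cfg);

static int modeset_with_rollback(struct mode_config *proposed,
                                 struct mode_config *last_good)
{
    if (share_pixmap_with_sink(proposed) < 0 || apply_mode(proposed) < 0) {
        apply_mode(last_good);          /* restore whatever was working */
        return -1;                      /* fail, but the screen stays lit */
    }
    return 0;
}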

It's worse than that on my current laptop, though, since our driver sees a
phantom CRT output and we happily start driving pixels to it that end up going
nowhere.  I'll need to think about what the right behavior is there since I
don't know if we want to rely on an X client to make that configuration work.

>> 3. How should the memory be allocated?
>>
>> In the prototype I threw together, I'm allocating the shared memory using
>> shm_open and then exporting that as a dma-buf file descriptor using an
>> ioctl I added to the kernel, and then importing that memory back into our
>> driver through dma_buf_attach & dma_buf_map_attachment.  Does it make
>> sense for user-space programs to be able to export shmfs files like that?
>> Should that interface go in DRM / GEM / PRIME instead?  Something else?
>> I'm pretty unfamiliar with this kernel code so any suggestions would be
>> appreciated.
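
(For concreteness, the userspace side of that prototype is roughly the
following.  SHMBUF_IOC_EXPORT_DMABUF is a stand-in for the ioctl I added;
where that interface should really live is exactly the open question.)

#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>

#define SHMBUF_IOC_EXPORT_DMABUF _IOR('S', 0, int)   /* hypothetical ioctl */

static int export_shm_as_dmabuf(const char *name, size_t size)
{
    int shm_fd = shm_open(name, O_RDWR | O_CREAT, 0600);
    if (shm_fd < 0)
        return -1;
    if (ftruncate(shm_fd, size) < 0) {
        close(shm_fd);
        return -1;
    }

    int dmabuf_fd = -1;                  /* filled in by the kernel */
    if (ioctl(shm_fd, SHMBUF_IOC_EXPORT_DMABUF, &dmabuf_fd) < 0) {
        perror("export dma-buf");
        close(shm_fd);
        return -1;
    }
    /* The importing driver then uses dma_buf_attach() and
     * dma_buf_map_attachment() on its side, as described above. */
    return dmabuf_fd;
}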

> Your kernel driver should in theory be doing it all: if you allocate shared
> pixmaps in GTT-accessible memory, then you need an ioctl to tell your
> kernel driver to export the dma-buf to an fd handle (assuming we get rid of
> the _GPL, which people have mentioned they are open to doing).  We have
> handle->fd and fd->handle interfaces on DRM; you'd need something similar
> on the nvidia kernel driver interface.
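
For reference, the generic DRM PRIME path Dave describes looks roughly like
this from userspace; an NVIDIA-specific ioctl pair would mirror it:

#include <fcntl.h>       /* O_CLOEXEC, used by DRM_CLOEXEC */
#include <stdint.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <drm/drm.h>

/* Export a GEM handle as a dma-buf fd on the source device. */
static int gem_handle_to_prime_fd(int drm_fd, uint32_t handle)
{
    struct drm_prime_handle args = {
        .handle = handle,
        .flags  = DRM_CLOEXEC,
        .fd     = -1,
    };
    if (ioctl(drm_fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, &args) < 0) {
        perror("DRM_IOCTL_PRIME_HANDLE_TO_FD");
        return -1;
    }
    return args.fd;      /* shareable with the other driver */
}

/* Import a dma-buf fd into a GEM handle on the sink device. */
static int prime_fd_to_gem_handle(int drm_fd, int dmabuf_fd, uint32_t *handle)
{
    struct drm_prime_handle args = { .fd = dmabuf_fd };
    if (ioctl(drm_fd, DRM_IOCTL_PRIME_FD_TO_HANDLE, &args) < 0) {
        perror("DRM_IOCTL_PRIME_FD_TO_HANDLE");
        return -1;
    }
    *handle = args.handle;
    return 0;
}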

Okay, I can do that.  We already have a mechanism for importing buffers
allocated elsewhere so reusing that for shmfs and/or dma-buf seemed like a
natural extension.  I don't think adding a separate ioctl for exporting our own
allocations will add too much extra code.

> Yes, for question 4 (synchronization), some sort of fencing is being worked
> on by Maarten for other stuff, but it would be a prerequisite for doing
> this.  Also, some devices don't want full-screen updates, like USB, so
> doing flipped updates would have to be optional or negotiated.  It makes
> sense for us as well, since things like gnome-shell can do full-screen
> pageflips and we have to do full-screen dirty updates.

Right now my implementation has two sources of tearing:

1. The dGPU reads the vidmem primary surface asynchronously from its own
   rendering to it.

2. The iGPU fetches the shared surface for display asynchronously from the dGPU
   writing into it.

#1 I can fix within our driver.  For #2, I don't want to rely on the dGPU being
able to push complete frames over the bus during vblank in response to an iGPU
fence trigger so I was thinking we would want double-buffering all the time.
Also, I was hoping to set up a proper flip chain between the dGPU, the dGPU's
DMA engine, and the Intel display engine so that for full-screen applications,
glXSwapBuffers is stalled properly without relying on the CPU to schedule
things.  Maybe that's overly ambitious for now?
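
Roughly the per-frame flow I'm imagining; every name here is a stand-in for
driver internals, not a real API:

/* Sketch of the double-buffered update described above.  The dGPU copies a
 * complete frame into the back shared buffer, the copy is fenced, and only
 * then does the iGPU flip, so it never scans out a half-written frame. */
struct buffer;
extern void dgpu_copy(struct buffer *src, struct buffer *dst);  /* DMA blit */
extern void dgpu_wait_copy_complete(void);    /* fence the copy engine */
extern void igpu_flip_to(struct buffer *buf); /* latch scanout at vblank */

struct flip_pair {
    struct buffer *front;   /* what the iGPU is scanning out now */
    struct buffer *back;    /* where the dGPU writes the next frame */
};

static void present_frame(struct flip_pair *p, struct buffer *rendered)
{
    dgpu_copy(rendered, p->back);   /* push the complete frame over the bus */
    dgpu_wait_copy_complete();      /* never flip to a half-written buffer */
    igpu_flip_to(p->back);

    struct buffer *tmp = p->front;  /* swap roles for the next frame */
    p->front = p->back;
    p->back  = tmp;
}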

-- Aaron
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/dri-devel

