Re: [PATCH 1/3] drm/amdgpu: Refuse to create a KMS FB for non-P2P exported dma-bufs

Michel Dänzer <michel@xxxxxxxxxxx> · Fri, 23 Feb 2024 11:04:27 +0100

On 2024-02-23 10:34, Christian König wrote:
> Am 23.02.24 um 09:11 schrieb Michel Dänzer:
>> On 2024-02-23 08:06, Christian König wrote:
>>> Am 22.02.24 um 18:28 schrieb Michel Dänzer:
>>>> From: Michel Dänzer <mdaenzer@xxxxxxxxxx>
>>>>
>>>> Pinning the BO storage to VRAM for scanout would make it inaccessible
>>>> to non-P2P dma-buf importers.
>>> Thinking more about it I don't think we can do this.
>>>
>>> Using the BO in a ping/pong fashion for scanout and DMA-buf is actually valid, you just can't do both at the same time.
>>>
>>> And if I'm not completely mistaken we actually have use cases for this at the moment,
>> Those use cases don't have P2P & CONFIG_DMABUF_MOVE_NOTIFY?
> 
> Nope, we are basically talking about unit tests and examples for inter device operations.

Sounds like the "no user-space regressions" rule might not apply then.

> Those render into a shared buffer and then display it to check if the content was rendered/transferred correctly.

That can be fixed by dropping the dma-buf attachments from other GPUs before creating the KMS FB.

Conversely, tests / examples which do scanout first can be fixed by destroying KMS FBs before sharing the BO with another GPU.

> I'm not sure if we still do those test cases, the last time I looked into it was before P2P was even supported, but I also can't rule it out.

Sounds too vague to block this series.

>>> So rejecting things during CS and atomic commit is the best thing we can do.
>> It's problematic for a Wayland compositor:
>>
>> The CS ioctl failing is awkward. With GL, I'm pretty sure it means the compositor would have to destroy the context and create a new one. Not sure about Vulkan, but I suspect it's painful there as well.
>>
>> Similarly for the KMS atomic commit ioctl. The compositor can't know why exactly it failed, all it gets is an error code.
> 
> Yeah, but that is not because the kernel is doing anything wrong.
> 
> Sharing, rendering and then doing an atomic commit is a perfectly valid use case.
> 
> You just can't do scanout and sharing at the same time.

Per my later follow-up, Xwayland can't really avoid it.

>> And there's no other way for the compositor to detect when both things can actually work concurrently.
> 
> That I totally agree with. And IIRC we already have at least a query for the buffer placement. E.g. you can already check if the BO is in GTT or VRAM and shared.
> 
> What's missing is exposing if the device can scanout from GTT or not.

Requiring Wayland compositors to have driver-specific knowledge like that baked in isn't great either.

-- 
Earthling Michel Dänzer            |                  https://redhat.com
Libre software enthusiast          |         Mesa and Xwayland developer