On Wed, Nov 13, 2024 at 10:02:12AM +0100, Christian König wrote:
> Am 13.11.24 um 03:30 schrieb Matthew Brost:
> > [SNIP]
> > > > > If you're using gpuvm, just call drm_gpuvm_resv_add_fence. I assume AMD has a
> > > > > similarly simple call.
> > > >
> > > > Nope, we try to avoid locking all BOs in the VM as hard as we can.
> > >
> > > Why? Calling in to perform fence conversion shouldn't be all that
> > > frequent and simplifies things.
> > >
> > > Also, it's likely that only a few locks are involved, as not too many
> > > external BOs are mapped within a VM (i.e., most BOs share the VM's
> > > dma-resv lock).
>
> The most common use case are multi GPU systems which share a lot of data in
> a NUMA cluster.
>
> This configuration has almost all BOs shared between GPUs making locking the
> whole VM a task with massive overhead which should be avoided as much as
> possible.
>

Let me look into use cases on our end.

> > > > > Now the ordering works inherently in dma-resv and the scheduler. e.g. No
> > > > > need to track the last completion fences explictly in preempt fences.
> > > >
> > > > I really don't think that this is a good approach. Explicitly keeping the
> > > > last completion fence in the pre-emption fence is basically a must have as
> > > > far as I can see.
> > > >
> > > > The approach you take here looks like a really ugly hack to me.
> > >
> > > Well, I have to disagree; it seems like a pretty solid, common design.
>
> What you basically do is to move the responsibility to signal fences in the
> right order from the provider of the fences to the consumer of it.
>

Are there not ordering rules already built into dma-resv? This is just
extending those rules to preempt fences.

I can maybe buy part of this argument: if a random yahoo enables signaling
on a preempt fence without properly going through dma-resv or the
scheduler, then yes, that could break things. But don't do that.
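For reference, the drm_gpuvm_resv_add_fence path mentioned above looks roughly
like the following. This is an untested sketch against the upstream drm_exec /
drm_gpuvm helpers, with error handling elided; the DMA_RESV_USAGE values shown
are illustrative and not a claim about what any particular driver passes.

```c
/*
 * Untested sketch: lock the VM's dma-resv plus all external BOs with a
 * drm_exec retry loop, then attach one fence everywhere via the common
 * gpuvm helper. Usage values are illustrative only.
 */
struct drm_exec exec;
int ret;

drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0);
drm_exec_until_all_locked(&exec) {
	/* Lock the VM's resv and reserve a fence slot. */
	ret = drm_gpuvm_prepare_vm(gpuvm, &exec, 1);
	drm_exec_retry_on_contention(&exec);
	if (ret)
		break;

	/* Lock every external BO mapped in the VM. */
	ret = drm_gpuvm_prepare_objects(gpuvm, &exec, 1);
	drm_exec_retry_on_contention(&exec);
	if (ret)
		break;
}

if (!ret)
	drm_gpuvm_resv_add_fence(gpuvm, &exec, fence,
				 DMA_RESV_USAGE_BOOKKEEP,
				 DMA_RESV_USAGE_BOOKKEEP);
drm_exec_fini(&exec);
```

The point of the helper is that BOs sharing the VM's dma-resv cost a single
lock; only external BOs add extra locks, which is the overhead Christian is
objecting to in the shared-NUMA case.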
In Xe, we have exactly two places where these can be triggered: the TTM BO
move vfunc (which the scheduler handles) and the MMU invalidation path
(dma-resv). I would expect all drivers and users of dma-resv and the
scheduler with preempt fences to signal preempt fences through the proper
interfaces rather than bypassing them, thus ensuring the common rules for
preempt fences are adhered to.

> Since we have tons of consumers of that stuff this is not even remotely a
> defensive design.
>

I can consider other designs, but I do think larger community input may be
required since, as you mentioned, there are several consumers of this code.

> > > Anyway, I think I have this more or less working. I want to run this by
> > > the Mesa team a bit to ensure I haven't missed anything, and will likely
> > > post something shortly after.
> > >
> > > We can discuss this more after I post and perhaps solicit other
> > > opinions, weighing the pros and cons of the approaches here. I do think
> > > they function roughly the same, so something commonly agreed upon would
> > > be good. Sharing a bit of code, if possible, is always a plus too.
>
> Well to make it clear that will never ever get a green light from my side as
> DMA-buf maintainer. What you suggest here is extremely fragile.
>

It is unfortunate that this is your position, and I feel it is a bit
premature to declare as much. I didn't know being a maintainer was akin to
being a dictator. As I've said multiple times, I feel this warrants more
discussion with more stakeholders. If the design is still unacceptable
after that, sure, I can change it.

> Why not simply wait for the pending completion fences as dependency for
> signaling preemption fences?
>
> That should work for all drivers and is trivial to implement as far as I can
> see.

Again, this is hard to understand without a clear picture of what AMD is
doing.
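The ordering rule under discussion, "wait for the pending completion fences
before signaling the preemption fence," can be modeled in userspace with plain
pthreads. This is a toy sketch, not the kernel's struct dma_fence; the names
(`toy_fence`, `preempt_signal_ordered`) are made up for illustration.

```c
#include <pthread.h>
#include <stdbool.h>

/*
 * Toy userspace model of a fence: a flag protected by a mutex, with a
 * condition variable for waiters. Not the kernel's dma_fence.
 */
struct toy_fence {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool signaled;
};

static void toy_fence_init(struct toy_fence *f)
{
	pthread_mutex_init(&f->lock, NULL);
	pthread_cond_init(&f->cond, NULL);
	f->signaled = false;
}

static void toy_fence_signal(struct toy_fence *f)
{
	pthread_mutex_lock(&f->lock);
	f->signaled = true;
	pthread_cond_broadcast(&f->cond);
	pthread_mutex_unlock(&f->lock);
}

static void toy_fence_wait(struct toy_fence *f)
{
	pthread_mutex_lock(&f->lock);
	while (!f->signaled)
		pthread_cond_wait(&f->cond, &f->lock);
	pthread_mutex_unlock(&f->lock);
}

/*
 * The ordering rule in question: the preemption fence may only signal
 * once the last completion fence it covers has signaled. Christian's
 * proposal tracks this dependency explicitly in the preempt fence.
 */
static void preempt_signal_ordered(struct toy_fence *last_completion,
				   struct toy_fence *preempt)
{
	toy_fence_wait(last_completion);
	toy_fence_signal(preempt);
}
```

The disagreement in this thread is not whether this ordering must hold, but
where the dependency lives: explicitly inside the preempt fence, or implicitly
in the dma-resv and scheduler signaling paths.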
I pointed out a dma-fencing problem in the code on the list, and the
response was, "Well, we have some magic ordering rules that make it safe."
That might be true, but if you annotated your code, lockdep would complain.
Anything that cannot be annotated is a non-starter for me. This makes me
nervous that, in fact, it is not as trivial as you suggest, nor is the
design as sound as you believe.

Anyway, I'll still likely post a common solution with annotations. If it is
rejected, so be it, and I will rework it.

In the spirit of open development, here is my code in a public branch:

Kernel: https://gitlab.freedesktop.org/mbrost/xe-kernel-driver-umd-submission/-/tree/v2-11-13-24?ref_type=heads
IGT: https://gitlab.freedesktop.org/mbrost/igt-gpu-tools-umd-submission/-/tree/umd_submission.v2?ref_type=heads

Matt

> Regards,
> Christian.
>
> > > Matt
> > >
> > > > Regards,
> > > > Christian.
> > > >