Re: Tackling the indefinite/user DMA fence problem

Daniel Vetter <daniel@xxxxxxxx> · Wed, 25 May 2022 15:51:27 +0200



On Wed, May 25, 2022 at 03:28:41PM +0200, Michel Dänzer wrote:
> On 2022-05-25 15:05, Daniel Vetter wrote:
> > On Tue, May 17, 2022 at 12:28:17PM +0200, Christian König wrote:
> >> Am 09.05.22 um 16:10 schrieb Daniel Vetter:
> >>> On Mon, May 09, 2022 at 08:56:41AM +0200, Christian König wrote:
> >>>> Am 04.05.22 um 12:08 schrieb Daniel Vetter:
> >>>>>
> >>>>> If the goal is specifically atomic kms, then there's an entire can of
> >>>>> worms there that I really don't want to think about, but it exists: We
> >>>>> have dma_fence as out-fences from atomic commit, and that's already
> >>>>> massively broken since most drivers allocate some memory or at least take
> >>>>> locks which can allocate memory in their commit path. Like i2c. Putting a
> >>>>> userspace memory fence as in-fence in there makes that problem
> >>>>> substantially worse, since at least in theory you're just not allowed to
> >>>>> might_faul in atomic_commit_tail.
> >>>> Yes, that's unfortunately one of the goals as well and yes I completely
> >>>> agree on the can of worms. But I think I've solved that.
> >>>>
> >>>> What I do in the patch set is to enforce that the out fence is an user fence
> >>>> when the driver supports user in fences as well.
> >>>>
> >>>> Since user fences doesn't have the memory management dependency drivers can
> >>>> actually allocate memory or call I2C functions which takes locks which have
> >>>> memory allocation dependencies.
> >>>>
> >>>> Or do I miss some other reason why you can't fault or allocate memory in
> >>>> atomic_commit_tail? At least lockdep seems to be happy about that now.
> >>> The problem is a bit that this breaks the uapi already. At least if the
> >>> goal is to have this all be perfectly transparent for userspace - as you
> >>> as you have multi-gpu setups going on at least.
> >>
> >> Question here is why do you think there is an UAPI break? We currently wait
> >> in a work item already, so where exactly is the problem?
> > 
> > It's a bit washy, but dma_fence and hence implicit sync is supposed to
> > finish in finite time. umf just doesn't.
> > 
> > Ofc in reality you can still flood your compositor and they're not very
> > robust, but with umf it's trivial to just hang your compositor forever and
> > nothing happens.
> 
> You can add that to the list of reasons why compositors need to stop
> using buffers with unsignaled fences. There's plenty of other reasons
> there already (the big one being that otherwise slow clients can slow
> down the compositor, even if the compositor uses a high priority context
> and the HW supports preemption).

Yeah that's tbh another reason why I think we shouldn't do umf as a
transparent thing - compositors need to get better anyway, so we might as
well take this as a chance to do this right.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch