Re: [PATCH 4/4] RFC: dma-buf: Add an API for importing sync files (v6)

On Wed, May 26, 2021 at 1:09 PM Daniel Stone <daniel@xxxxxxxxxxxxx> wrote:
>
> Hey,
>
> On Mon, 24 May 2021 at 18:11, Jason Ekstrand <jason@xxxxxxxxxxxxxx> wrote:
> > On Sat, May 22, 2021 at 3:05 PM Daniel Stone <daniel@xxxxxxxxxxxxx> wrote:
> > > So far, so good. I really like both your series up until this
> > > narrative point; as I was saying in the userspace-fence thread, the
> > > WSI<->client thread is certainly pulling a very big lever with a
> > > heavyweight transition between the two different worlds, and I like
> > > that the new export ioctl lets us be very clear about what exactly is
> > > happening under the hood. Good stuff.
> >
> > Glad to hear.  If you turned that into RBs on the first three patches
> > in this series and all but the last patch on the Mesa MR, it'd make me
> > even happier. :-D
> >
> > At this point, I think everyone is pretty happy with the first three
> > patches and the export ioctl.  In the Vulkan WSI code, it solves a
> > significant over-synchronization issue for ANV.  Also, the uAPI
> > shouldn't be controversial at all because it's identical to poll()
> > except that it gives you a FD you can poll on later to get the result
> > of the poll as it would be now.  I think if we get some Mesa reviews,
> > we should be able to land those.  It's import that's trickier.
>
> Agree. Have a few more burning things on my list (not least fd.o infra
> this week) but will get to it so we can land.
>
> (Clarified on IRC that my description above was accurate to the best
> of our shared understanding, so we're all on the same page here.)

fd.o work is much appreciated!
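
(Aside, since the poll() comparison keeps coming up: from userspace the
export side boils down to roughly the below. Struct and ioctl names here
follow the series' uAPI naming and may not match this revision exactly;
treat it as a sketch of the semantics, not the patch itself.)

#include <linux/dma-buf.h>
#include <sys/ioctl.h>

/* Sketch: grab a sync_file that snapshots the dma-buf's current fences,
 * so the "result of the poll" can be waited on (or passed along) later. */
static int export_read_fence(int dmabuf_fd)
{
	struct dma_buf_export_sync_file arg = {
		.flags = DMA_BUF_SYNC_READ,
		.fd = -1,
	};

	if (ioctl(dmabuf_fd, DMA_BUF_IOCTL_EXPORT_SYNC_FILE, &arg) < 0)
		return -1;

	/* arg.fd is a sync_file fd: poll()ing it later gives the answer
	 * that poll() on the dma-buf itself would have given right now. */
	return arg.fd;
}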

> > > That makes sense to me and is nicely symmetrical, plus it gets GPU
> > > drivers further out of the business of doing magic winsys/display
> > > synchronisation, which is good. But why enforce that imported fences
> > > have to go into the exclusive slot? It makes sense from the point of
> > > view of clients doing QueueSubmit, but I think it might preclude other
> > > uses for no particularly good reason.
> >
> > Mostly, I was trying to limit the scope a bit.  Import is tricky and,
> > to be honest, I'm still not 100% sure that it's safe.  I probably
> > should have left RFC on this patch.
> >
> > As long as we keep everything inside the kernel, there's a requirement
> > that any work which triggers any fence on a dma_resv waits on the
> > exclusive fence, if any.  Work which sets the exclusive fence has to
> > wait on all fences.  With the import ioctl, we can guarantee that the
> > fences fire in the right order by making an array fence containing the
> > new fence and all other fences and using that as the exclusive fence.
> > What we can't do, however, is ensure that the work which triggers the
> > fence actually blocks on ANY of the fences on the dma_resv.
> >
> > Hrm... I think I've now convinced myself that importing shared fences
> > is no more dangerous than importing an exclusive fence because they
> > share the same set of problems.  Unfortunately, I've unconvinced
> > myself of the safety of either.  I've got to think some more....
> >
> > The most convincing argument I can make to myself is that this ioctl
> > is like having a one-shot little queue that contains tiny little work
> > jobs which wait on whatever sync_file you provide, do nothing, and
> > then signal.  That should be safe, right?
>
> Yeah, I don't think there's any difference between shared and
> exclusive wrt safety. The difference lies in, well, exclusive putting
> a hard serialisation barrier between everything which comes before and
> everything that comes after, and shared being more relaxed to allow
> for reads to retire in parallel.
>
> As said below, I think there's a good argument for the latter once you
> get out of the very straightforward uses. One of the arguments for
> these ioctls is to eliminate oversync, but then the import ioctl
> mandates oversync in the case where the consumer only does
> non-destructive reads - which is the case for the vast majority of
> users!

Just wanted to comment on this: Right now we always attach a shared
end-of-batch fence to every dma_resv, so reads are automatically and
always synced. In that sense having an explicit ioctl to set the read
fence is not really useful, since at most you just make everything
worse.
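
To make that concrete, the pattern every driver follows today looks
roughly like this (dma_resv helper names from the current kernel, the
policy is simplified; sketch only, not any driver's actual code):

#include <linux/dma-resv.h>

/* After a batch is submitted the driver publishes its end-of-batch
 * fence on every dma_resv it touched.  Reads land in the shared slots,
 * writes in the exclusive slot, which is why reads are already synced
 * without any extra import ioctl. */
static int publish_batch_fence(struct dma_resv *resv,
			       struct dma_fence *end_of_batch,
			       bool wrote_to_buffer)
{
	int ret;

	dma_resv_assert_held(resv);

	if (wrote_to_buffer) {
		dma_resv_add_excl_fence(resv, end_of_batch);
		return 0;
	}

	/* Shared slots have to be reserved before a fence can be added. */
	ret = dma_resv_reserve_shared(resv, 1);
	if (ret)
		return ret;

	dma_resv_add_shared_fence(resv, end_of_batch);
	return 0;
}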

Until we have userspace memory fences this is also pretty much
guaranteed to stay like that, except if we maybe split up the shared
fences into "relevant for implicit sync" and "not so relevant for
implicit sync". amdgpu has an interesting model there: it only engages
in implicit sync if the fence comes from a different drm_file (but
because of that it lacks the read/write distinction and oversyncs in
other cases, oops). I haven't wrapped my head around how to glue those
two approaches together.
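
Very roughly, and very much not the actual amdgpu code, the shape of
that heuristic is just:

#include <drm/drm_file.h>

/* Illustrative only: amdgpu-style implicit sync keys off "who submitted
 * the work behind this fence" instead of the read/write distinction.
 * Both owners here are whatever the driver tracked at submit time. */
static bool sketch_need_implicit_sync(struct drm_file *fence_owner,
				      struct drm_file *submitter)
{
	/* Same drm_file: assume the app syncs explicitly.  Different
	 * drm_file: treat it as a winsys boundary and sync implicitly. */
	return fence_owner != submitter;
}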

So anyway, right now all we really have on the table for explicit
control of the implicit fences is just the following (roughly the two
operations sketched below):
- should I stall for the exclusive fence/all fences/not at all
- should I set the exclusive fence
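
In dma_resv terms those two knobs look roughly like the below; the
second one is the array-fence trick Jason describes above for the
import ioctl. Sketch only, locking simplified and error handling mostly
trimmed:

#include <linux/dma-fence-array.h>
#include <linux/dma-resv.h>
#include <linux/sched.h>
#include <linux/slab.h>

/* Knob 1: what do I stall for before touching the buffer? */
static long sketch_wait(struct dma_resv *resv, bool wait_all)
{
	/* wait_all = false: exclusive fence only; true: all fences. */
	return dma_resv_wait_timeout_rcu(resv, wait_all, true,
					 MAX_SCHEDULE_TIMEOUT);
}

/* Knob 2: install a new exclusive fence without breaking the ordering
 * rules, by folding the existing fences and the new one into a single
 * array fence.  A real implementation would hold the resv lock across
 * collecting and replacing the fences. */
static int sketch_set_excl(struct dma_resv *resv, struct dma_fence *new_fence)
{
	struct dma_fence **fences, *excl;
	struct dma_fence_array *array;
	unsigned int shared_count;
	int ret;

	ret = dma_resv_get_fences_rcu(resv, &excl, &shared_count, &fences);
	if (ret)
		return ret;

	fences = krealloc_array(fences, shared_count + 2,
				sizeof(*fences), GFP_KERNEL);
	if (!fences)
		return -ENOMEM;

	if (excl)
		fences[shared_count++] = excl;
	fences[shared_count++] = dma_fence_get(new_fence);

	/* The array fence only signals once all of its inputs have
	 * signalled, so using it as the exclusive fence keeps everything
	 * ordered behind the existing fences. */
	array = dma_fence_array_create(shared_count, fences,
				       dma_fence_context_alloc(1), 0, false);
	if (!array)
		return -ENOMEM;

	dma_resv_lock(resv, NULL);
	dma_resv_add_excl_fence(resv, &array->base);
	dma_resv_unlock(resv);

	return 0;
}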

And from my additional auditing this week we already have plenty of
small bugs in this area :-( So lifting more restrictions on what you
can do when handling implicit fences explicitly is going to be major
surgery ... Implicit fencing is bound to stay a very funny IPC
mechanism for fences for quite some time, I think.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


