On Tue, 20 Jun 2023 11:14:51 +0200
Christian König <ckoenig.leichtzumerken@xxxxxxxxx> wrote:

> On 20.06.23 at 11:09, Boris Brezillon wrote:
> > On Tue, 20 Jun 2023 10:44:26 +0200
> > Christian König <ckoenig.leichtzumerken@xxxxxxxxx> wrote:
> >
> >> On 20.06.23 at 10:28, Boris Brezillon wrote:
> >>> On Tue, 20 Jun 2023 10:12:13 +0200
> >>> Christian König <ckoenig.leichtzumerken@xxxxxxxxx> wrote:
> >>>
> >>>>> I think Boris's suggestion of having this through a common
> >>>>> DRM_EXEC_FLAG_ALLOW_DUPLICATES flag fits well.
> >>>> No, again. The only driver which should accept duplicates is radeon;
> >>>> for all other drivers, especially new ones, duplicates should
> >>>> probably be rejected.
> >>>>
> >>>> We only allow this for radeon because it is already UAPI. It could be
> >>>> that we need to do this for amdgpu as well, but I really hope we
> >>>> don't need to.
> >>> Just want to describe the use case we have: we support submission in
> >>> batches (several jobs passed to the submit ioctl) with a
> >>> submit-all-or-nothing model: if any of the job descriptions is passed
> >>> wrong args or causes an allocation error, we fail the whole group. In
> >>> the submission path, we want to prepare GEMs for all jobs. That means
> >>> adding enough fence slots for the number of job-finished fences. Given
> >>> that not all jobs will access the same set of BOs, I thought I could
> >>> use duplicates support to make my life easier, because otherwise I
> >>> have to collect all BOs upfront, store them in a temporary array, and
> >>> keep track of the number of fence slots needed for each of them. I
> >>> guess the other option would be to over-estimate the number of slots
> >>> and make it equal to num_jobs for all BOs.
> >> Sounds pretty much like what amdgpu is doing as well, but the question
> >> is why don't you just pass one list of BOs? Do you really want to add
> >> the fences that fine-grained?
> > Actually, we don't give a list of BOs at all; we pass a VM, and lock
> > all BOs attached to the VM (similar to what Xe does). And, like all
> > other drivers submitted recently, we use explicit sync, so most of
> > those VM BOs, except for the imported/exported ones, will be given a
> > BOOKKEEP fence.
> >
> > The reason we need support for duplicates is that we also have
> > implicit BOs (like the HWRT object that's shared by the
> > geometry/fragment queues to pass data around), and those can be passed
> > to multiple jobs in a given batch and require special synchronization
> > (the geometry job writes to them, the fragment job reads from them, so
> > we have a reader/writer sync to express). I can of course de-duplicate
> > upfront, by parsing jobs and creating an array of BOs that need to be
> > acquired over the whole submission, but that's still one extra step
> > I'd prefer to avoid, given the dma_resv framework allows us to figure
> > it out at lock time. I can also just deal with the EALREADY case in
> > the driver directly; it's not like it's super complicated anyway. I
> > just thought other drivers would fall into the same situation, that's
> > all.
>
> Well, as long as you just need to ignore EALREADY, that should be
> trivial and doable.

Oh, yeah, that's all I need really. We probably don't want to add the
GEM object a second time to the array though, hence the goto
reserve_fences in my proposal when EALREADY is returned.

> What radeon needs is to keep EALREADY BOs in a separate container,
> because we need to double-check their properties to not break the UAPI.
>
> I strongly think that this shouldn't be needed by any other driver.
>
> Going to add a flag to ignore EALREADY which can be set during exec
> init.

Thanks!
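For readers following along, here is a rough sketch of the "goto
reserve_fences on EALREADY" idea discussed above. This is NOT the actual
patch: it is written against the drm_exec API as it looked in the
proposed series at the time (function names and signatures may differ in
the merged version), and the helper name `submit_lock_bos` and the
assumption that the lock call returns -EALREADY for a duplicate BO are
both hypothetical. It only illustrates the control flow: a duplicate is
not treated as an error and is not added to the array a second time, but
its extra fence slots still get reserved.

```
/*
 * Hypothetical sketch, not compilable standalone (kernel context).
 * Assumes a drm_exec-style locking loop where locking an object that
 * was already locked earlier in the same exec returns -EALREADY.
 */
static int submit_lock_bos(struct drm_exec *exec,
			   struct drm_gem_object **bos,
			   unsigned int num_bos,
			   unsigned int num_fences)
{
	unsigned int i;
	int ret;

	drm_exec_until_all_locked(exec) {
		for (i = 0; i < num_bos; i++) {
			ret = drm_exec_lock_obj(exec, bos[i]);
			drm_exec_retry_on_contention(exec);

			/*
			 * Duplicate BO: it is already in the exec array,
			 * so skip the error path and just reserve the
			 * additional fence slots this job needs.
			 */
			if (ret == -EALREADY)
				goto reserve_fences;
			if (ret)
				return ret;

reserve_fences:
			ret = dma_resv_reserve_fences(bos[i]->resv,
						      num_fences);
			if (ret)
				return ret;
		}
	}

	return 0;
}
```

With the flag Christian mentions at the end of the thread, the driver
would presumably request duplicate-ignoring behavior at drm_exec init
time instead of open-coding the -EALREADY check as above.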