On Wed, 13 Sep 2023 17:05:42 +1000
Dave Airlie <airlied@xxxxxxxxx> wrote:

> On Wed, 13 Sept 2023 at 17:03, Boris Brezillon
> <boris.brezillon@xxxxxxxxxxxxx> wrote:
> >
> > On Tue, 12 Sep 2023 18:20:32 +0200
> > Thomas Hellström <thomas.hellstrom@xxxxxxxxxxxxxxx> wrote:
> >
> > > > +/**
> > > > + * get_next_vm_bo_from_list() - get the next vm_bo element
> > > > + * @__gpuvm: The GPU VM
> > > > + * @__list_name: The name of the list we're iterating on
> > > > + * @__local_list: A pointer to the local list used to store already iterated items
> > > > + * @__prev_vm_bo: The previous element we got from drm_gpuvm_get_next_cached_vm_bo()
> > > > + *
> > > > + * This helper is here to provide lockless list iteration. Lockless as in, the
> > > > + * iterator releases the lock immediately after picking the first element from
> > > > + * the list, so list insertion/deletion can happen concurrently.
> > >
> > > Are the list spinlocks needed for that async state update from within
> > > the dma-fence critical section we've discussed previously?
> >
> > Any driver calling _[un]link() from its drm_gpu_scheduler::run_job()
> > hook will be in this situation (Panthor at the moment, PowerVR soon). I
> > get that Xe and Nouveau don't need that because they update the VM
> > state early (in the ioctl path), but I keep thinking this will hurt us
> > if we don't think it through from the beginning, because once you've
> > set this logic to depend only on resv locks, it will be pretty hard to
> > get back to a solution which lets synchronous VM_BINDs take precedence
> > over asynchronous requests, and, with vkQueueBindSparse() passing
> > external deps (plus the fact the VM_BIND queue might be pretty deep),
> > it can take a long time to get your synchronous VM_BIND executed...
>
> btw what is the use case for this? do we have actual vulkan
> applications we know will have problems here?

I don't, but I think that's a concern Faith raised at some point (it
dates back to when I was reading threads describing how VM_BIND on i915
should work, and I was clearly discovering this whole VM_BIND thing at
the time, so maybe I misunderstood).

> it feels like a bit of premature optimisation, but maybe we have use
> cases.

Might be, but that's the sort of thing that would put us in a corner if
we don't have a plan for when the need arises. Besides, if we don't want
to support that case because it's too complicated, I'd recommend
dropping all the drm_gpuvm APIs that let people think this mode is
valid/supported (the map/remap/unmap hooks in drm_gpuvm_ops, the
drm_gpuvm_sm_[un]map helpers, etc). Keeping them around just adds to
the confusion.
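
For reference, the iteration scheme the kerneldoc above describes boils
down to something like the following. This is a rough sketch with
made-up names (vm_bo, get_next_vm_bo, the list/lock parameters), not
the actual drm_gpuvm code:

#include <linux/list.h>
#include <linux/spinlock.h>

struct vm_bo {
	struct list_head entry;	/* links into the shared or the local list */
};

/*
 * Pick the first element off the shared list under the spinlock, park
 * it on a caller-owned local list so the next call grabs a fresh
 * element, then drop the lock before returning.
 */
static struct vm_bo *
get_next_vm_bo(spinlock_t *lock, struct list_head *shared_list,
	       struct list_head *local_list)
{
	struct vm_bo *vm_bo;

	spin_lock(lock);
	vm_bo = list_first_entry_or_null(shared_list, struct vm_bo, entry);
	if (vm_bo)
		list_move_tail(&vm_bo->entry, local_list);
	spin_unlock(lock);

	/*
	 * The lock is no longer held here, so concurrent insertions and
	 * deletions, e.g. _[un]link() called from a run_job() hook, can
	 * proceed while the caller operates on vm_bo.
	 */
	return vm_bo;
}

A real implementation would also have to deal with refcounting the
element before the lock is dropped, and with splicing the local list
back once the iteration is done, but the point is that the lock is only
held across the list manipulation itself, which is what makes it safe
to race with _[un]link() calls coming from a dma-fence signalling path.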