Hi Faith,

Sorry for the late reply, I only got back to panthor very recently.

On Mon, 11 Dec 2023 12:18:04 -0600
Faith Ekstrand <faith.ekstrand@xxxxxxxxxxxxx> wrote:

> On Mon, 2023-12-11 at 09:52 +0100, Boris Brezillon wrote:
> > Hi,
> >
> > On Sun, 10 Dec 2023 13:58:51 +0900
> > Tatsuyuki Ishi <ishitatsuyuki@xxxxxxxxx> wrote:
> >
> > > > On Dec 5, 2023, at 2:32, Boris Brezillon
> > > > <boris.brezillon@xxxxxxxxxxxxx> wrote:
> > > >
> > > > Hello,
> > > >
> > > > This is the 3rd version of the kernel driver for Mali CSF-based
> > > > GPUs.
> > > >
> > > > With all the DRM dependencies being merged (drm-sched single
> > > > entity and drm_gpuvm), I thought now was a good time to post a
> > > > new version. Note that the iommu series we depend on [1] has
> > > > been merged recently. The only remaining dependency that hasn't
> > > > been merged yet is this rather trivial drm_gpuvm [2] patch.
> > > >
> > > > As for v2, I pushed a branch based on drm-misc-next and
> > > > containing all the dependencies that are not yet available in
> > > > drm-misc-next here[3], and another one [4] containing extra
> > > > patches to have things working on rk3588. The CSF firmware
> > > > binary can be found here[5], and should be placed under
> > > > /lib/firmware/arm/mali/arch10.8/mali_csffw.bin.
> > > >
> > > > The mesa MR adding v10 support on top of panthor is available
> > > > here [6].
> > > >
> > > > Regarding the GPL2+MIT relicensing, Liviu did an audit and found
> > > > two more people that I didn't spot initially: Clément Péron for
> > > > the devfreq code, and Alexey Sheplyakov for some bits in
> > > > panthor_gpu.c. Both are Cc-ed on the relevant patches. The rest
> > > > of the code is either new, or covered by the Linaro, Arm and
> > > > Collabora acks.
> > > >
> > > > And here is a non-exhaustive changelog, check each commit for a
> > > > detailed changelog.
> > > >
> > > > v3:
> > > > - Quite a few changes at the MMU/sched level to fix some race
> > > >   conditions and deadlocks
> > > > - Addition of a sync-only VM_BIND operation (to support
> > > >   vkQueueBindSparse with zero commands).
> > >
> > > Hi Boris,
> > >
> > > Just wanted to point out that vkQueueBindSparse's semantics are
> > > rather different from vkQueueSubmit's when it comes to
> > > synchronization. In short, vkQueueBindSparse does not operate on
> > > a particular timeline (aka scheduling queue / drm_sched_entity).
> > > The property of following a timeline order is known as
> > > "submission order" [1] in Vulkan, and applies to vkQueueSubmit
> > > only, not to vkQueueBindSparse.
> >
> > Hm, okay. I really thought the same ordering guarantees applied to
> > sparse binding queues too, as the spec [1] says
> >
> > "
> > Batches begin execution in the order they appear in pBindInfo, but
> > may complete out of order.
> > "
>
> Right. So this is something where the Vulkan spec isn't terribly clear
> and I think I need to go file a spec bug. I'm fairly sure that the
> intent is that bind operations MAY complete out-of-order but are not
> required to complete out-of-order. In other words, I'm fairly sure
> that implementations are allowed but not required to insert extra
> dependencies that force some amount of ordering. We take advantage of
> this in Mesa today to properly implement vkQueueWaitIdle() on sparse
> binding queues.

Do I get it correctly that, for Mesa's generic
vk_common_QueueWaitIdle() implementation to work correctly, we not
only need to guarantee in-order submission, but also in-order
completion on the queue? I mean, that's no problem for panvk/panthor,
because that's how it's implemented anyway, but I didn't realize this
constraint existed until you mentioned it.
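To make sure we're talking about the same thing, here is, roughly, the
pattern I have in mind (just a sketch, not the actual Mesa code;
last_submit_syncobj is a made-up name, drmSyncobjWait() is the real
libdrm wrapper):

#include <stdint.h>
#include <xf86drm.h>

static int queue_wait_idle(int drm_fd, uint32_t last_submit_syncobj)
{
        /*
         * Only waiting on the syncobj attached to the most recent
         * submission is a valid implementation of "wait until the
         * queue is idle" if and only if the queue completes jobs in
         * order: job N done implies jobs 0..N-1 are done too.
         */
        return drmSyncobjWait(drm_fd, &last_submit_syncobj, 1,
                              INT64_MAX,
                              DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT,
                              NULL);
}

If completion could happen out of order, we'd have to track and wait
on the syncobjs of all submissions still in flight instead of just the
last one.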
> This is also a requirement of Windows WDDM2 as far as
> I can tell so I'd be very surprised if we disallowed it in the spec.
>
> That said, that's all very unclear and IDK where you'd get that from
> the spec. I'm going to go file a spec bug so we can get this sorted
> out.
>
> ~Faith
>
> > which means things are submitted in order inside a
> > vkQueueBindSparse context, so I was expecting the submission
> > ordering guarantee to apply across vkQueueBindSparse() calls on the
> > same queue too. Just want to mention that all kernel
> > implementations I have seen so far assume VM_BIND submissions on a
> > given queue are serialized (that's how drm_sched works, and Xe,
> > Nouveau and Panthor are basing their VM_BIND implementation on
> > drm_sched).
> >
> > Maybe Faith, or anyone deeply involved in the Vulkan specification,
> > can confirm that submission ordering guarantees are relaxed on
> > sparse binding queues.
> >
> > > Hence, an implementation that takes full advantage of Vulkan
> > > semantics would essentially have an infinite number of VM_BIND
> > > contexts.
> >
> > Uh, that's definitely not how I understood it initially...
> >
> > > It would also not need sync-only VM_BIND submissions, assuming
> > > that drmSyncobjTransfer works.
> >
> > Sure, if each vkQueueBindSparse() has its own timeline, an internal
> > timeline syncobj with a bunch of drmSyncobjTransfer() calls would
> > do the trick (might require several ioctls to merge input syncobjs,
> > but that's probably not the end of the world).

Back to the kernel-side implementation. As Tatsuyuki pointed out, we
can always replace the sync-only VM_BIND (no actual operation on the
VM, just a bunch of wait/signal sync operations) with N calls to
drmSyncobjTransfer() and a mesa-driver-specific timeline syncobj that's
used to consolidate all the operations we have submitted so far (see
the sketch at the end of this mail). But given Nouveau supports this
sync-only operation (at least that's my understanding of the
nouveau_uvmm.c code and how VM_BIND is currently used in NVK), I guess
there are good reasons to support it natively. Besides saving a few
ioctl() calls, and the fact that supporting VM_BIND with zero VM ops is
trivial enough that it's probably worth adding just to simplify the UMD
implementation, is there anything else I'm missing?

Regards,

Boris
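P.S.: for the record, here is roughly what the UMD-side alternative
mentioned above could look like. This is just a sketch to make sure we
agree on the mechanics; drmSyncobjTransfer() is the only real libdrm
entry point here, all the other names are made up.

#include <stdint.h>
#include <xf86drm.h>

/*
 * Fold N input syncobj points into a mesa-driver-internal timeline
 * syncobj, one drmSyncobjTransfer() ioctl per input. Since timeline
 * syncobjs are backed by fence chains, waiting on the last
 * transferred point waits on all inputs. Assumes first_point is
 * allocated monotonically by the caller.
 */
static int merge_into_timeline(int drm_fd, uint32_t merged_syncobj,
                               uint64_t first_point,
                               const uint32_t *in_handles,
                               const uint64_t *in_points,
                               unsigned int in_count)
{
        for (unsigned int i = 0; i < in_count; i++) {
                /* Attach in_handles[i]@in_points[i] to the timeline. */
                int ret = drmSyncobjTransfer(drm_fd, merged_syncobj,
                                             first_point + i,
                                             in_handles[i],
                                             in_points[i], 0);
                if (ret)
                        return ret;
        }

        /*
         * The signal side can be handled the same way, by transferring
         * merged_syncobj@(first_point + in_count - 1) into each output
         * syncobj, so nothing ever blocks on the CPU.
         */
        return 0;
}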