On 7/31/23 15:35, Boris Brezillon wrote:
+Danilo, to confirm my understanding of the gpuva remap operation is
correct.
Your understanding is correct.
Unfortunately, re-mapping things has such implications.
I'm currently working on tracking external GEM objects in the GPUVA
manager, where, ideally, you'd want to add the extobj to the VM when the
first mapping backed by this GEM is created and remove it when the
last mapping backed by this GEM is removed. Hence, extobjs need to
be ref-counted based on how many mappings they back.
However, when re-mapping such a mapping, the reference counter might
drop to 0 temporarily and the slot of the data structure tracking the
extobj is cleaned up and needs to be re-allocated. Surely, we could just
increase the reference count while re-mapping or for the whole
transaction (job), but this would make the API kinda bulky.
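To illustrate the slot churn described above, here is a minimal userspace sketch, not the GPUVA manager code. Everything in it (struct vm, struct extobj_slot, vm_extobj_get(), vm_extobj_put(), the slot_allocs counter) is a hypothetical name made up for the example. It only shows that a remap done as "put old mapping, get new mapping" re-allocates the slot, while holding one extra reference across the remap does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_EXTOBJS 8

/* Hypothetical per-VM slot tracking one external GEM object. */
struct extobj_slot {
    const void *gem;   /* identity of the backing object, NULL if the slot is free */
    unsigned int refs; /* number of references (mappings) on this object */
};

/* Hypothetical VM; must be zero-initialized before use. */
struct vm {
    struct extobj_slot slots[MAX_EXTOBJS];
    unsigned int slot_allocs; /* counts slot allocations, for illustration only */
};

/* Find the slot for 'gem'; passing NULL finds the first free slot. */
static struct extobj_slot *vm_find(struct vm *vm, const void *gem)
{
    for (size_t i = 0; i < MAX_EXTOBJS; i++)
        if (vm->slots[i].gem == gem)
            return &vm->slots[i];
    return NULL;
}

/* Take one reference, allocating a slot on the 0 -> 1 transition. */
static bool vm_extobj_get(struct vm *vm, const void *gem)
{
    struct extobj_slot *s;

    assert(gem != NULL);
    s = vm_find(vm, gem);
    if (!s) {
        s = vm_find(vm, NULL);
        if (!s)
            return false; /* table full */
        s->gem = gem;
        s->refs = 0;
        vm->slot_allocs++;
    }
    s->refs++;
    return true;
}

/* Drop one reference, freeing the slot on the 1 -> 0 transition. */
static void vm_extobj_put(struct vm *vm, const void *gem)
{
    struct extobj_slot *s = vm_find(vm, gem);

    assert(s && s->refs > 0);
    if (--s->refs == 0)
        s->gem = NULL;
}

/* Remap without a temporary reference: the slot is freed, then re-allocated. */
static bool vm_remap_naive(struct vm *vm, const void *gem)
{
    vm_extobj_put(vm, gem);        /* old mapping goes away, refs may hit 0 */
    return vm_extobj_get(vm, gem); /* new mapping is created */
}

/* Remap while holding an extra reference: the slot survives the operation. */
static bool vm_remap_held(struct vm *vm, const void *gem)
{
    bool ok;

    if (!vm_extobj_get(vm, gem))   /* temporary reference for the remap */
        return false;
    ok = vm_remap_naive(vm, gem);
    vm_extobj_put(vm, gem);        /* drop the temporary reference */
    return ok;
}
```

With a single mapping on the object, vm_remap_naive() bumps slot_allocs a second time, whereas vm_remap_held() leaves it at 1. The cost is the extra get/put pair the caller has to remember, which is the API bulk mentioned above.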
On Mon, 31 Jul 2023 15:27:31 +0300
Dmitry Osipenko <dmitry.osipenko@xxxxxxxxxxxxx> wrote:
On 7/25/23 11:32, Boris Brezillon wrote:
Can we make it an atomic_t, so we can avoid taking the lock when the
GEM has already been pinned? That's something I need to be able to grab
a pin-ref in a path where the GEM resv lock is already held[1]. We could
of course expose the locked version,
My bad, that's actually not true. The problem is not that I call
drm_gem_shmem_pin() with the resv lock already held, but that I call
drm_gem_shmem_pin() in a dma-signaling path where I'm not allowed to
take a resv lock. I know for sure pin_count > 0, because all GEM objects
mapped to a VM have their memory pinned right now, and this should
stand until we decide to add support for live-GEM eviction, at which
point we'll probably have a way to detect when a GEM is evicted, and
avoid calling drm_gem_shmem_pin() on it.
TL;DR: I can't trade the atomic_t for a drm_gem_shmem_pin_locked(),
because that wouldn't solve my problem. The other solution would be to
add an atomic_t at the driver-GEM level, and only call
drm_gem_shmem_[un]pin() on 0 <-> 1 transitions, but I thought using an
atomic at the GEM-shmem level, to avoid locking when we can, would be
beneficial to the rest of the eco-system. Let me know if that's not an
option, and I'll go back to the driver-specific atomic_t.
Could you please explain why you need to pin the GEM in a signal handler?
This is not something drivers usually do or need to do. You likely also
shouldn't need to detect that a GEM is evicted in your driver. I'd expect
that Panthor shouldn't differ from Panfrost with regard to how GEM memory
management is done, and Panfrost doesn't need to do anything special.
Panthor VM management is completely different, and the case I'm
referring to is 'asynchronous VM_BIND': mapping a GEM object to a GPU VM
asynchronously, so we can make it depend on other operations, encoded as
syncobjs passed to the VM_BIND operation.
Here is the workflow we have for this use case:
1. Create + push a VM_BIND job to the VM_BIND queue (a drm_sched_entity
that's taking care of asynchronous VM map/unmap operations). Because
this operation is asynchronous, and the execution itself happens in a
dma-signaling path (drm_sched::run_job()), we need to pre-allocate the
MMU page tables for the worst case scenario, and make sure the GEM pages
are pinned at job creation time.
2. The VM operation itself is executed when all dependencies are met
(drm_sched calls run_job()). In case of a map operation, we call
drm_gpuva_sm_map(), which might split the map operation into
remap+unmap+map ones if the region being mapped is covering a region
that was previously mapped to a different GEM object or a different
portion of the same GEM object (see the gpuva_mgr doc [1]). A
remap operation is just a way to split an existing mapping into 2 mappings
covering the left/right side of the previous mapping, plus a hole in
the middle. This means that our VM mapping object (drm_gpuva), which
was pointing to a GEM object that had its pages pinned, is now turned
into 2 mapping objects, and we need to make sure those 2 mappings own a
reference to the pages, otherwise we'll have an unbalanced refcount
when we release those 2 mappings further down the road.
3. Release resources attached to mappings that were removed (that
includes releasing the ref we had on GEM pages) and free the mapping
objects. We do that asynchronously, outside of the dma-signaling path.
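The refcount balance in steps 2 and 3 can be sketched in plain userspace C as follows. This is a simplified model under stated assumptions, not the Panthor or drm_gpuva code: struct gem, struct mapping, remap_split() and mapping_release() are hypothetical names, and the pin count is modelled as a C11 atomic_int so that it can be bumped without a lock, which is only safe here because the original mapping is assumed to already hold a pin reference.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a GEM object with pinned pages. */
struct gem {
    atomic_int pin_count;
};

/* Hypothetical stand-in for a VM mapping object. */
struct mapping {
    uint64_t va;     /* start GPU virtual address */
    uint64_t size;   /* size in bytes */
    uint64_t offset; /* offset into the GEM object */
    struct gem *gem;
};

/* Lockless get: only valid while the caller already owns a pin reference. */
static void gem_pin_get(struct gem *gem)
{
    int old = atomic_fetch_add(&gem->pin_count, 1);

    assert(old > 0);
}

static void gem_pin_put(struct gem *gem)
{
    int old = atomic_fetch_sub(&gem->pin_count, 1);

    assert(old > 0);
}

/*
 * Split 'orig' around the hole [hole_start, hole_end). Fills 'prev' and/or
 * 'next' with the surviving left/right parts, each taking its own pin
 * reference, and returns how many of them were created (0 to 2). A part
 * that is not created has its size set to 0. The reference owned by 'orig'
 * is left untouched; it is dropped later by mapping_release().
 */
static int remap_split(const struct mapping *orig,
                       uint64_t hole_start, uint64_t hole_end,
                       struct mapping *prev, struct mapping *next)
{
    uint64_t end = orig->va + orig->size;
    int count = 0;

    /* The hole must be non-empty and intersect the original mapping. */
    assert(hole_start < hole_end && hole_start < end && hole_end > orig->va);

    prev->size = 0;
    next->size = 0;

    if (hole_start > orig->va) {
        *prev = (struct mapping){ orig->va, hole_start - orig->va,
                                  orig->offset, orig->gem };
        gem_pin_get(orig->gem);
        count++;
    }
    if (hole_end < end) {
        *next = (struct mapping){ hole_end, end - hole_end,
                                  orig->offset + (hole_end - orig->va),
                                  orig->gem };
        gem_pin_get(orig->gem);
        count++;
    }
    return count;
}

/* Deferred cleanup (step 3): drop the pin reference owned by a mapping. */
static void mapping_release(struct mapping *m)
{
    gem_pin_put(m->gem);
    m->gem = NULL;
}
```

In this model, splitting a mapping that owns one pin reference leaves the count at 3 until the original mapping is released, and the count only returns to 0 once both halves are released too. If the two new mappings did not take their own references, releasing them later would underflow the count, which is the unbalanced refcount described in step 2.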
Note that patch #14 makes the locked pin/unpin functions public and turns
the unlocked variants into helpers, so you'll be able to experiment with
these funcs in the Panthor driver.
Unfortunately, those won't help. I really need a way to increment the
refcount without holding the lock, because we're in a dma-signaling
path when we call drm_gpuva_sm_map(). Note that I could live with a
drm_shmem_gem_pin_if_already_pinned() variant that would return NULL if
pin_count == 0 instead of trying to acquire the lock, but I'd still
need this refcount to be an atomic_t.
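For reference, the "pin only if already pinned" idea can be modelled in userspace C roughly like this. It is a sketch under assumptions, not the drm_gem_shmem API: struct shmem_obj, pin(), unpin() and pin_if_already_pinned() are hypothetical, a pthread mutex stands in for the resv lock, a bool stands in for the backing pages, and the lockless helper returns a bool rather than the return value discussed above. The lock is only taken on 0 <-> 1 transitions.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a shmem-backed GEM object. */
struct shmem_obj {
    atomic_int pin_count;
    pthread_mutex_t resv; /* stands in for the GEM reservation lock */
    bool pages_resident;  /* stands in for the pinned backing pages */
};

/*
 * Lockless path: take a pin reference only if one already exists.
 * Never blocks and never takes 'resv', so it could be used where
 * locking is not allowed. Returns false if the object is not pinned.
 */
static bool pin_if_already_pinned(struct shmem_obj *obj)
{
    int cur = atomic_load(&obj->pin_count);

    while (cur > 0) {
        /* On failure, 'cur' is reloaded with the current value. */
        if (atomic_compare_exchange_weak(&obj->pin_count, &cur, cur + 1))
            return true;
    }
    return false;
}

/* Full pin: lockless when already pinned, locked for the 0 -> 1 transition. */
static void pin(struct shmem_obj *obj)
{
    if (pin_if_already_pinned(obj))
        return;

    pthread_mutex_lock(&obj->resv);
    if (atomic_load(&obj->pin_count) == 0)
        obj->pages_resident = true; /* "get pages" before publishing the ref */
    atomic_fetch_add(&obj->pin_count, 1);
    pthread_mutex_unlock(&obj->resv);
}

/* Unpin: lockless unless this might be the last reference. */
static void unpin(struct shmem_obj *obj)
{
    int cur = atomic_load(&obj->pin_count);

    while (cur > 1) {
        if (atomic_compare_exchange_weak(&obj->pin_count, &cur, cur - 1))
            return;
    }

    pthread_mutex_lock(&obj->resv);
    if (atomic_fetch_sub(&obj->pin_count, 1) == 1)
        obj->pages_resident = false; /* "put pages" on the 1 -> 0 transition */
    pthread_mutex_unlock(&obj->resv);
}
```

The count only becomes non-zero after the pages are in place, and only drops to zero under the lock, so a successful pin_if_already_pinned() implies the pages were resident when the reference was taken. A caller that cannot take the lock would treat a false return as an error rather than falling back to pin().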
As I said, an alternative to this approach would be to have a separate
atomic refcount at the panthor_gem_object level, but I feel like we'd
just be duplicating something that exists already.
[1] https://cgit.freedesktop.org/drm/drm-misc/tree/drivers/gpu/drm/drm_gpuva_mgr.c#n67