Hello Daniel,

27.04.2022 17:50, Daniel Vetter wrote:
> On Mon, Apr 18, 2022 at 10:18:54PM +0300, Dmitry Osipenko wrote:
>> Hello,
>>
>> On 4/18/22 21:38, Thomas Zimmermann wrote:
>>> Hi
>>>
>>> On 18.04.22 00:37, Dmitry Osipenko wrote:
>>>> Replace drm_gem_shmem locks with the reservation lock to make GEM
>>>> locking more consistent.
>>>>
>>>> Previously drm_gem_shmem_vmap() and drm_gem_shmem_get_pages() were
>>>> protected by separate locks, now it's the same lock, but it doesn't
>>>> make any difference for the current GEM SHMEM users. Only Panfrost
>>>> and Lima drivers use vmap() and they do it in the slow code paths,
>>>> hence there was no practical justification for the usage of a
>>>> separate lock in vmap().
>>>>
>>>> Suggested-by: Daniel Vetter <daniel@xxxxxxxx>
>>>> Signed-off-by: Dmitry Osipenko <dmitry.osipenko@xxxxxxxxxxxxx>
>>>> ---
>> ...
>>>> @@ -310,7 +306,7 @@ static int drm_gem_shmem_vmap_locked(struct drm_gem_shmem_object *shmem,
>>>>  	} else {
>>>>  		pgprot_t prot = PAGE_KERNEL;
>>>> -		ret = drm_gem_shmem_get_pages(shmem);
>>>> +		ret = drm_gem_shmem_get_pages_locked(shmem);
>>>>  		if (ret)
>>>>  			goto err_zero_use;
>>>>
>>>> @@ -360,11 +356,11 @@ int drm_gem_shmem_vmap(struct drm_gem_shmem_object *shmem,
>>>>  {
>>>>  	int ret;
>>>> -	ret = mutex_lock_interruptible(&shmem->vmap_lock);
>>>> +	ret = dma_resv_lock_interruptible(shmem->base.resv, NULL);
>>>>  	if (ret)
>>>>  		return ret;
>>>>  	ret = drm_gem_shmem_vmap_locked(shmem, map);
>>>
>>> Within drm_gem_shmem_vmap_locked(), there's a call to dma_buf_vmap()
>>> for imported pages. If the exporter side also holds/acquires the same
>>> reservation lock as our object, the whole thing can deadlock. We
>>> cannot move dma_buf_vmap() out of the CS, because we still need to
>>> increment the reference counter. I honestly don't know how to easily
>>> fix this problem. There's a TODO item about replacing these locks at
>>> [1]. As Daniel suggested this patch, we should talk to him about the
>>> issue.
>>>
>>> Best regards
>>> Thomas
>>>
>>> [1]
>>> https://www.kernel.org/doc/html/latest/gpu/todo.html#move-buffer-object-locking-to-dma-resv-lock
>>
>> Indeed, good catch! Perhaps we could simply use a separate lock for
>> the vmapping of the *imported* GEMs? The vmap_use_count is used only
>> by vmap/vunmap, so it doesn't matter which lock is used by these
>> functions in the case of imported GEMs, since we only need to protect
>> the vmap_use_count.
>
> Apologies for the late reply, I'm flooded.
>
> I discussed this with Daniel Stone last week in a chat, roughly what we
> need to do is:
>
> 1. Pick a function from shmem helpers.
>
> 2. Go through all drivers that call this, and make sure that we acquire
>    dma_resv_lock in the top level driver entry point for this.
>
> 3. Once all driver code paths are converted, add a dma_resv_assert_held()
>    call to that function to make sure you have it all correctly.
>
> 4. Repeat 1-3 until all shmem helper functions are converted over.

Somehow I didn't notice the existence of dma_resv_assert_held(), thank
you for the suggestion :)

> 5. Ditch the 3 different shmem helper locks.
>
> The trouble is that I forgot that vmap is a thing, so that needs more
> work. I think there's two approaches here:
> - Do the vmap at import time. This is the trick we used to untangle the
>   dma_resv_lock issues around dma_buf_attachment_map()
> - Change the dma_buf_vmap rules that callers must hold the
>   dma_resv_lock.

I'll consider this option for v6, thank you.
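To make sure I understand the second option correctly: the top-level
helper from this patch would keep the reservation lock held across the
whole vmap, including the dma_buf_vmap() done for imported objects.
Roughly like this (just a sketch of what I have in mind for v6, the
exporter side would need matching changes):

int drm_gem_shmem_vmap(struct drm_gem_shmem_object *shmem,
		       struct iosys_map *map)
{
	int ret;

	ret = dma_resv_lock_interruptible(shmem->base.resv, NULL);
	if (ret)
		return ret;

	/*
	 * The resv lock stays held across drm_gem_shmem_vmap_locked(),
	 * which calls dma_buf_vmap() for imported objects. Under the
	 * changed rules the exporter sees the lock already held and
	 * must not take it again, which avoids the deadlock that
	 * Thomas pointed out.
	 */
	ret = drm_gem_shmem_vmap_locked(shmem, map);
	dma_resv_unlock(shmem->base.resv);

	return ret;
}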
I see now that you actually want to define the new rules for dma-bufs
in general and not only in the context of the DRM code; this makes much
more sense to me now.

> - Maybe also do what you suggest and keep a separate lock for this, but
>   the fundamental issue is that this doesn't really work - if you share
>   buffers both ways with two drivers using shmem helpers, then the
>   ordering of this vmap_count_mutex vs dma_resv_lock is inconsistent
>   and you can get some nice deadlocks. So not a great approach (and
>   also the reason why we really need to get everyone to move towards
>   dma_resv_lock as _the_ buffer object lock, since otherwise we'll
>   never get a consistent lock nesting hierarchy).

The separate locks should work okay because it will always be the
exporter that takes the dma_resv_lock. But I agree that it's less ideal
than defining the new rules for dma-bufs, since sometimes you will take
the resv lock and sometimes not, potentially hiding locking bugs.

> The trouble here is that trying to be clever and doing the conversion
> just in shmem helpers won't work, because there's a lot of cases where
> the drivers are all kinds of inconsistent with their locking.
>
> Adding Daniel S, also maybe for questions it'd be fastest to chat on
> irc?

My nickname is digetx on the #dri-devel channel, feel free to ping me
if needed. Right now your suggestions are clear to me, hence no extra
questions.

Thank you for the review.
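P.S. To double-check that I understood the dma_resv_assert_held()
suggestion, this is roughly how a converted helper would look in v6 (a
simplified sketch; the vmap_use_count handling and error unwinding are
omitted here):

static int drm_gem_shmem_vmap_locked(struct drm_gem_shmem_object *shmem,
				     struct iosys_map *map)
{
	struct drm_gem_object *obj = &shmem->base;
	void *vaddr;
	int ret;

	/* catches driver entry points that weren't converted yet */
	dma_resv_assert_held(obj->resv);

	if (obj->import_attach)
		return dma_buf_vmap(obj->import_attach->dmabuf, map);

	ret = drm_gem_shmem_get_pages_locked(shmem);
	if (ret)
		return ret;

	vaddr = vmap(shmem->pages, obj->size >> PAGE_SHIFT,
		     VM_MAP, PAGE_KERNEL);
	if (!vaddr)
		return -ENOMEM;

	iosys_map_set_vaddr(map, vaddr);

	return 0;
}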