Re: [PATCH 03/15] dma-buf & drm/amdgpu: remove dma_resv workaround

On 20.04.22 at 20:49, Christian König wrote:
On 20.04.22 at 20:41, Zack Rusin wrote:
On Wed, 2022-04-20 at 19:40 +0200, Christian König wrote:
On 20.04.22 at 19:38, Zack Rusin wrote:
On Wed, 2022-04-20 at 09:37 +0200, Christian König wrote:
Hi Zack,

On 20.04.22 at 05:56, Zack Rusin wrote:
On Thu, 2022-04-07 at 10:59 +0200, Christian König wrote:
Rework the internals of the dma_resv object to allow adding more than
one write fence and remember for each fence what purpose it had.

This allows removing the workaround from amdgpu which used a container
for this instead.

Signed-off-by: Christian König <christian.koenig@xxxxxxx>
Reviewed-by: Daniel Vetter <daniel.vetter@xxxxxxxx>
Cc: amd-gfx@xxxxxxxxxxxxxxxxxxxxx
AFAICT this change broke vmwgfx, which now oopses the kernel right
after boot. I haven't had the time to look into it yet, so I'm not
sure what the problem is. I'll look at this tomorrow, but just in case
you have some clues, the backtrace follows:

That's a known issue and should already be fixed with:

commit d72dcbe9fce505228dae43bef9da8f2b707d1b3d
Author: Christian König <christian.koenig@xxxxxxx>
Date:   Mon Apr 11 15:21:59 2022 +0200
Unfortunately that doesn't seem to be it. The backtrace is from the
current (as of the time of sending of this email) drm-misc-next, which
has this change, so it's something else.
Ok, that's strange. In this case I need to investigate further.

Maybe VMWGFX is adding more than one fence and we actually need to
reserve multiple slots.
This might be a helper code issue with CONFIG_DEBUG_MUTEXES set. With that
config, dma_resv_reset_max_fences does:
    fences->max_fences = fences->num_fences;
For some objects num_fences is 0, so afterwards max_fences and num_fences
are both 0, and then BUG_ON(num_fences >= max_fences) is triggered.

Yeah, but that's expected behavior.

What's not expected is that max_fences is still 0 (or equal to old num_fences) when VMWGFX tries to add a new fence. The function ttm_eu_reserve_buffers() should have reserved at least one fence slot.

So the underlying problem is that either ttm_eu_reserve_buffers() was never called or VMWGFX tried to add more than one fence.
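
To make the two failure cases concrete, here is a minimal userspace sketch of the slot accounting discussed above. It is only a toy model under stated assumptions: struct toy_fence_list and the toy_* helpers are hypothetical names, not the kernel's dma_resv API, and the model ignores allocation as well as any replacement of already-present fences. It keeps just the num_fences/max_fences bookkeeping and the check quoted in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the fence list bookkeeping; NOT the kernel struct. */
struct toy_fence_list {
	unsigned int num_fences; /* slots currently in use */
	unsigned int max_fences; /* slots reserved so far */
};

/* Models reserving room for `extra` fences on top of those in use. */
static void toy_reserve_fences(struct toy_fence_list *l, unsigned int extra)
{
	if (l->num_fences + extra > l->max_fences)
		l->max_fences = l->num_fences + extra;
}

/* Models the debug-only reset quoted above: drop all unused slots. */
static void toy_reset_max_fences(struct toy_fence_list *l)
{
	l->max_fences = l->num_fences;
}

/*
 * Models adding one fence. Returns false in the situation where the
 * kernel check quoted above, BUG_ON(num_fences >= max_fences), would fire.
 */
static bool toy_add_fence(struct toy_fence_list *l)
{
	/* Invariant of this toy model: never more fences than slots. */
	assert(l->num_fences <= l->max_fences);

	if (l->num_fences >= l->max_fences)
		return false;

	l->num_fences++;
	return true;
}
```

In this model, adding a fence without any prior reservation fails, adding a second fence after reserving only one slot fails, and a reservation followed by the reset is lost again. Those correspond to the candidate explanations above; which one applies to VMWGFX is exactly what the test patch is meant to find out.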


To figure out which of the two it is, could you try the following code fragment:

diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c b/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c
index f46891012be3..a36f89d3f36d 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c
@@ -288,7 +288,7 @@ int vmw_validation_add_bo(struct vmw_validation_context *ctx,
                val_buf->bo = ttm_bo_get_unless_zero(&vbo->base);
                if (!val_buf->bo)
                        return -ESRCH;
-               val_buf->num_shared = 0;
+               val_buf->num_shared = 16;
                list_add_tail(&val_buf->head, &ctx->bo_list);
                bo_node->as_mob = as_mob;
                bo_node->cpu_blit = cpu_blit;

Thanks,
Christian.


Regards,
Christian.


z





