[AMD Official Use Only - General]

Nack to the revert.

This is part of a whole series of doorbell changes, which includes replacements for these patches too:
https://patchwork.freedesktop.org/series/115802/

The whole series was pushed to the staging branch, but it looks like some of the patches did not merge. When the whole series is merged, functionality will be restored.

I will check whether there is any problem blocking the other patches from merging.

Regards
Shashank

-----Original Message-----
From: Zhang, Yifan <Yifan1.Zhang@xxxxxxx>
Sent: Friday, August 4, 2023 7:01 AM
To: amd-gfx@xxxxxxxxxxxxxxxxxxxxx
Cc: Deucher, Alexander <Alexander.Deucher@xxxxxxx>; Koenig, Christian <Christian.Koenig@xxxxxxx>; Sharma, Shashank <Shashank.Sharma@xxxxxxx>; Zhang, Yifan <Yifan1.Zhang@xxxxxxx>
Subject: [PATCH] Revert "drm/amdgpu: don't modify num_doorbells for mes"

This reverts commit f46644aa8de6d5efeff8d8c7fbf3ed58a89c765c.

The doorbell index could go beyond the first page for mes queues, so the
reverted commit breaks the mes self test on gfx11.

[ 23.212740] [drm] ring gfx_32768.1.1 was added
[ 23.213147] [drm] ring compute_32768.2.2 was added
[ 23.213540] [drm] ring sdma_32768.3.3 was added
[ 23.213546] [drm:amdgpu_mm_wdoorbell64 [amdgpu]] *ERROR* writing beyond doorbell aperture: 0x00001000!
[ 23.214148] amdgpu 0000:c2:00.0: amdgpu: gfx_v11_0_ring_set_wptr_gfx: 5168 0x402 0x1000 100
[ 23.560357] amdgpu 0000:c2:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring gfx_32768.1.1 test failed (-110)

Signed-off-by: Yifan Zhang <yifan1.zhang@xxxxxxx>
---
 .../gpu/drm/amd/amdgpu/amdgpu_doorbell_mgr.c | 34 +++++++++++--------
 1 file changed, 19 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell_mgr.c
index 5c0d3cea817d..31db526d4921 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_doorbell_mgr.c
@@ -140,21 +140,25 @@ int amdgpu_doorbell_init(struct amdgpu_device *adev)
 	adev->doorbell.base = pci_resource_start(adev->pdev, 2);
 	adev->doorbell.size = pci_resource_len(adev->pdev, 2);
 
-	adev->doorbell.num_kernel_doorbells =
-		min_t(u32, adev->doorbell.size / sizeof(u32),
-		      adev->doorbell_index.max_assignment + 1);
-	if (adev->doorbell.num_kernel_doorbells == 0)
-		return -EINVAL;
-
-	/*
-	 * For Vega, reserve and map two pages on doorbell BAR since SDMA
-	 * paging queue doorbell use the second page. The
-	 * AMDGPU_DOORBELL64_MAX_ASSIGNMENT definition assumes all the
-	 * doorbells are in the first page. So with paging queue enabled,
-	 * the max num_kernel_doorbells should + 1 page (0x400 in dword)
-	 */
-	if (adev->asic_type >= CHIP_VEGA10)
-		adev->doorbell.num_kernel_doorbells += 0x400;
+	if (adev->enable_mes) {
+		adev->doorbell.num_kernel_doorbells =
+			adev->doorbell.size / sizeof(u32);
+	} else {
+		adev->doorbell.num_kernel_doorbells =
+			min_t(u32, adev->doorbell.size / sizeof(u32),
+			      adev->doorbell_index.max_assignment+1);
+		if (adev->doorbell.num_kernel_doorbells == 0)
+			return -EINVAL;
+
+		/* For Vega, reserve and map two pages on doorbell BAR since SDMA
+		 * paging queue doorbell use the second page. The
+		 * AMDGPU_DOORBELL64_MAX_ASSIGNMENT definition assumes all the
+		 * doorbells are in the first page. So with paging queue enabled,
+		 * the max num_kernel_doorbells should + 1 page (0x400 in dword)
+		 */
+		if (adev->asic_type >= CHIP_VEGA10)
+			adev->doorbell.num_kernel_doorbells += 0x400;
+	}
 
 	adev->doorbell.ptr = ioremap(adev->doorbell.base,
 				     adev->doorbell.num_kernel_doorbells *
-- 
2.37.3
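
For anyone following the thread, the sizing logic that the revert restores can be modeled outside the kernel. What follows is only a minimal user-space sketch, not driver code: struct doorbell_cfg, its field names, and num_kernel_doorbells() are hypothetical stand-ins for the amdgpu_device fields the patch reads, and a return value of 0 stands in for the -EINVAL path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the amdgpu_device fields read by the patch. */
struct doorbell_cfg {
	uint64_t bar_size;        /* doorbell BAR length in bytes */
	uint32_t max_assignment;  /* highest kernel doorbell index */
	int enable_mes;           /* nonzero when MES is enabled */
	int vega10_or_newer;      /* models asic_type >= CHIP_VEGA10 */
};

/*
 * Number of 32-bit doorbell slots to map, following the branch structure
 * of the restored code. Returns 0 where the driver would return -EINVAL.
 */
static uint32_t num_kernel_doorbells(const struct doorbell_cfg *c)
{
	uint32_t slots, n;

	assert(c != NULL);
	slots = (uint32_t)(c->bar_size / sizeof(uint32_t));

	/* MES: map the whole BAR, since queue doorbells may sit past page 0. */
	if (c->enable_mes)
		return slots;

	/* Non-MES: clamp to the kernel assignments (min_t in the driver). */
	n = slots < c->max_assignment + 1 ? slots : c->max_assignment + 1;
	if (n == 0)
		return 0;

	/* Vega and later: one extra page (0x400 dwords) for SDMA paging queues. */
	if (c->vega10_or_newer)
		n += 0x400;

	return n;
}
```

With illustrative values (a 2 MiB BAR and max_assignment of 0x1FF, both assumptions chosen for the example), the MES branch yields 0x80000 slots, while the non-MES Vega branch yields 0x200 + 0x400 = 0x600.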