I'm not sure whether it is related to this change, but when I boot a system with amdgpu blacklisted and then modprobe amdgpu, I see the following issue. The failing path is gmc_v9_0_flush_gpu_tlb calling amdgpu_virt_kiq_req_write_reg_wait. If I boot without amdgpu blacklisted (amdgpu loaded directly during boot), it works fine.

[ 94.778231] amdgpu 0000:08:00.0: ring uvd_enc_0.1 uses VM inv eng 6 on hub 1
[ 94.778234] amdgpu 0000:08:00.0: ring vce0 uses VM inv eng 7 on hub 1
[ 94.778236] amdgpu 0000:08:00.0: ring vce1 uses VM inv eng 8 on hub 1
[ 94.778239] amdgpu 0000:08:00.0: ring vce2 uses VM inv eng 9 on hub 1
[ 94.778242] [drm] ECC is not present.
[ 94.778245] [drm] SRAM ECC is not present.
[ 94.779441] [drm] Initialized amdgpu 3.32.0 20150101 for 0000:08:00.0 on minor 0
[ 96.307042] rfkill: input handler enabled
[ 97.101621] amdgpu 0000:08:00.0: [mmhub] no-retry page fault (src_id:0 ring:158 vmid:0 pasid:0, for process pid 0 thread pid 0)
[ 97.101647] amdgpu 0000:08:00.0: in page starting at address 0x0000000000570000 from 18
[ 97.101653] amdgpu 0000:08:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x0000013C
[ 97.101663] amdgpu 0000:08:00.0: [mmhub] no-retry page fault (src_id:0 ring:158 vmid:0 pasid:0, for process pid 0 thread pid 0)
[ 97.101668] amdgpu 0000:08:00.0: in page starting at address 0x0000000000570000 from 18
[ 97.101673] amdgpu 0000:08:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x0000013D
[ 97.101681] amdgpu 0000:08:00.0: [mmhub] no-retry page fault (src_id:0 ring:158 vmid:0 pasid:0, for process pid 0 thread pid 0)
[ 97.101686] amdgpu 0000:08:00.0: in page starting at address 0x0000000000570000 from 18
[ 97.101691] amdgpu 0000:08:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x0000013C
[ 97.101699] amdgpu 0000:08:00.0: [mmhub] no-retry page fault (src_id:0 ring:158 vmid:0 pasid:0, for process pid 0 thread pid 0)
[ 97.101704] amdgpu 0000:08:00.0: in page starting at address 0x0000000000570000 from 18
[ 97.101708] amdgpu 0000:08:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x0000013C
[ 97.101717] amdgpu 0000:08:00.0: [mmhub] no-retry page fault (src_id:0 ring:158 vmid:0 pasid:0, for process pid 0 thread pid 0)
[ 97.101722] amdgpu 0000:08:00.0: in page starting at address 0x0000000000570000 from 18
[ 97.101726] amdgpu 0000:08:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x0000013C
[ 97.101735] amdgpu 0000:08:00.0: [mmhub] no-retry page fault (src_id:0 ring:158 vmid:0 pasid:0, for process pid 0 thread pid 0)
[ 97.101740] amdgpu 0000:08:00.0: in page starting at address 0x0000000000570000 from 18
[ 97.101744] amdgpu 0000:08:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x0000013C
[ 97.101753] amdgpu 0000:08:00.0: [mmhub] no-retry page fault (src_id:0 ring:158 vmid:0 pasid:0, for process pid 0 thread pid 0)
[ 97.101757] amdgpu 0000:08:00.0: in page starting at address 0x0000000000570000 from 18
[ 97.101762] amdgpu 0000:08:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x0000013C
[ 97.101770] amdgpu 0000:08:00.0: [mmhub] no-retry page fault (src_id:0 ring:158 vmid:0 pasid:0, for process pid 0 thread pid 0)
[ 97.101775] amdgpu 0000:08:00.0: in page starting at address 0x0000000000570000 from 18
[ 97.101779] amdgpu 0000:08:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x0000013C
[ 97.101788] amdgpu 0000:08:00.0: [mmhub] no-retry page fault (src_id:0 ring:158 vmid:0 pasid:0, for process pid 0 thread pid 0)
[ 97.101793] amdgpu 0000:08:00.0: in page starting at address 0x0000000000570000 from 18
[ 97.101797] amdgpu 0000:08:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x0000013C
[ 97.101806] amdgpu 0000:08:00.0: [mmhub] no-retry page fault (src_id:0 ring:158 vmid:0 pasid:0, for process pid 0 thread pid 0)
[ 97.101810] amdgpu 0000:08:00.0: in page starting at address 0x0000000000570000 from 18
[ 97.101815] amdgpu 0000:08:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x0000013C
[ 98.711247] failed to write reg 28b4 wait reg 28c6
[ 100.315395] failed to write reg 1a6f4 wait reg 1a706
[ 101.943593] failed to write reg 28b4 wait reg 28c6

Regards,
Oak

-----Original Message-----
From: amd-gfx <amd-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> On Behalf Of Chengming Gui
Sent: Friday, November 16, 2018 3:21 AM
To: amd-gfx@xxxxxxxxxxxxxxxxxxxxx
Cc: Gui, Jack <Jack.Gui@xxxxxxx>
Subject: [PATCH revert] Revert "drm/amdgpu: use GMC v9 KIQ workaround only for the GFXHUB"

With GFXOFF enabled, this patch causes amdgpu_test to fail on PCO, but GFXOFF is necessary for PCO, so revert the patch.

This reverts commit b83761bb0b09ec11c924afe9d88e458cb16a0372.

Signed-off-by: Jack Gui <Jack.Gui@xxxxxxx>
---
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index 811231e..14ca4d8 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -338,9 +338,9 @@ static void gmc_v9_0_flush_gpu_tlb(struct amdgpu_device *adev,
 		struct amdgpu_vmhub *hub = &adev->vmhub[i];
 		u32 tmp = gmc_v9_0_get_invalidate_req(vmid, flush_type);
 
-		if (i == AMDGPU_GFXHUB && !adev->in_gpu_reset &&
-		    adev->gfx.kiq.ring.sched.ready &&
-		    (amdgpu_sriov_runtime(adev) || !amdgpu_sriov_vf(adev))) {
+		if (adev->gfx.kiq.ring.sched.ready &&
+		    (amdgpu_sriov_runtime(adev) || !amdgpu_sriov_vf(adev)) &&
+		    !adev->in_gpu_reset) {
 			uint32_t req = hub->vm_inv_eng0_req + eng;
 			uint32_t ack = hub->vm_inv_eng0_ack + eng;
 
--
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/amd-gfx