The interrupt after FLR is missed sometimes due to hardware reason, so guest driver get the notification of FLR completion via polling message. Then host doesn't write VALID bit to avoid sending interrupt, otherwise the completion will be handled twice. So there's a valid message without VALID bit for FLR completion, driver should handle it without checking. Signed-off-by: Pixel Ding <Pixel.Ding at amd.com> --- drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c index 3164d61..3c6b0af 100644 --- a/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c +++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c @@ -368,9 +368,12 @@ static int xgpu_vi_mailbox_rcv_msg(struct amdgpu_device *adev, u32 reg; u32 mask = REG_FIELD_MASK(MAILBOX_CONTROL, RCV_MSG_VALID); - reg = RREG32_NO_KIQ(mmMAILBOX_CONTROL); - if (!(reg & mask)) - return -ENOENT; + /* workaround: host driver doesn't set VALID for CMPL now */ + if (event != IDH_FLR_NOTIFICATION_CMPL) { + reg = RREG32_NO_KIQ(mmMAILBOX_CONTROL); + if (!(reg & mask)) + return -ENOENT; + } reg = RREG32_NO_KIQ(mmMAILBOX_MSGBUF_RCV_DW0); if (reg != event) -- 2.7.4