On Mon, Feb 26, 2018 at 12:18 AM, Monk Liu <Monk.Liu at amd.com> wrote:
> issue:
> sometime GFX/MM ib test hit timeout under SRIOV env, root cause
> is that engine doesn't come back soon enough so the current
> IB test considered as timed out.
>
> fix:
> for SRIOV GFX IB test wait time need to be expanded a lot during
> SRIOV runtimei mode since it couldn't really begin before GFX engine
> come back.
>
> for SRIOV MM IB test it always need more time since MM scheduling
> is not go together with GFX engine, it is controled by h/w MM
> scheduler so no matter runtime or exclusive mode MM IB test
> always need more time.
>
> Change-Id: I0342371bc073656476ad850e1f5d9a021846dc8c
> Signed-off-by: Monk Liu <Monk.Liu at amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 30 +++++++++++++++++++++++++++++-
>  1 file changed, 29 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> index 4709d13..d6776286 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> @@ -316,14 +316,42 @@ int amdgpu_ib_ring_tests(struct amdgpu_device *adev)
>  {
>  	unsigned i;
>  	int r, ret = 0;
> +	long tmo_gfx, tmo_mm;
> +
> +	tmo_mm = tmo_gfx = AMDGPU_IB_TEST_TIMEOUT;
> +	if (amdgpu_sriov_vf(adev)) {
> +		/* for MM engines in hypervisor side they are not scheduled together
> +		 * with CP and SDMA engines, so even in exclusive mode MM engine could
> +		 * still running on other VF thus the IB TEST TIMEOUT for MM engines
> +		 * under SR-IOV should be set to a long time.
> +		 */
> +		tmo_mm = 8 * AMDGPU_IB_TEST_TIMEOUT; /* 8 sec should be enough for the MM comes back to this VF */
> +	}
> +
> +	if (amdgpu_sriov_runtime(adev)) {
> +		/* for CP & SDMA engines since they are scheduled together so
> +		 * need to make the timeout width enough to cover the time
> +		 * cost waiting for it coming back under RUNTIME only
> +		 */
> +		tmo_gfx = 8 * AMDGPU_IB_TEST_TIMEOUT;
> +	}
> +
> +	adev->accel_working = true;

This change seems unrelated.

>
>  	for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
>  		struct amdgpu_ring *ring = adev->rings[i];
> +		long tmo;
>
>  		if (!ring || !ring->ready)
>  			continue;
>
> -		r = amdgpu_ring_test_ib(ring, AMDGPU_IB_TEST_TIMEOUT);
> +		/* MM engine need more time */
> +		if (ring->idx > 11)

Please check ring type here rather than the idx since the idx may vary
based on the number of IPs on the SOC.

Alex

> +			tmo = tmo_mm;
> +		else
> +			tmo = tmo_gfx;
> +
> +		r = amdgpu_ring_test_ib(ring, tmo);
>  		if (r) {
>  			ring->ready = false;
>
> --
> 2.7.4
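
To illustrate the ring-type comment above, here is a minimal standalone sketch of picking the timeout from the ring's type instead of its index. Everything in it is a stand-in made up for illustration: `enum ring_type`, `struct ring`, `ring_is_mm()`, `ib_test_timeout()` and the 1000 ms base value are assumptions, not the driver's actual definitions, and the set of types treated as MM would need to match whatever MM engines the driver really exposes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical base timeout in milliseconds (stand-in, not the driver's macro). */
#define IB_TEST_TIMEOUT 1000L

/* Hypothetical ring types; the real driver defines its own enum. */
enum ring_type {
	RING_TYPE_GFX,
	RING_TYPE_COMPUTE,
	RING_TYPE_SDMA,
	RING_TYPE_UVD,
	RING_TYPE_VCE,
};

/* Hypothetical, heavily reduced ring descriptor. */
struct ring {
	enum ring_type type;
	unsigned idx;	/* position in the ring array; varies per SOC */
};

/* True for multimedia rings, decided by type and never by index. */
bool ring_is_mm(const struct ring *ring)
{
	assert(ring != NULL);

	switch (ring->type) {
	case RING_TYPE_UVD:
	case RING_TYPE_VCE:
		return true;
	default:
		return false;
	}
}

/*
 * Mirror the timeout policy from the patch: MM rings get the long
 * timeout whenever we run as a VF, GFX/compute/SDMA rings only in
 * SR-IOV runtime mode.
 */
long ib_test_timeout(const struct ring *ring, bool is_vf, bool is_runtime)
{
	long tmo_gfx = IB_TEST_TIMEOUT;
	long tmo_mm = IB_TEST_TIMEOUT;

	if (is_vf)
		tmo_mm = 8 * IB_TEST_TIMEOUT;
	if (is_runtime)
		tmo_gfx = 8 * IB_TEST_TIMEOUT;

	return ring_is_mm(ring) ? tmo_mm : tmo_gfx;
}
```

With this shape, a UVD ring at a low index still gets the long timeout on a VF, and a GFX ring that happens to sit at index 12 does not, which is the case the `ring->idx > 11` check would get wrong.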