On Fri, May 17, 2024 at 2:35 AM Christian König
<christian.koenig@xxxxxxx> wrote:
>
> On 16.05.24 at 19:57, Tim Van Patten wrote:
> > From: Tim Van Patten <timvp@xxxxxxxxxx>
> >
> > The following commit updated gmc->noretry from 0 to 1 for GC HW IP
> > 9.3.0:
> >
> >     commit 5f3854f1f4e2 ("drm/amdgpu: add more cases to noretry=1")
> >
> > This causes the device to hang when a page fault occurs, until the
> > device is rebooted. Instead, revert back to gmc->noretry=0 so the device
> > is still responsive.
>
> Wait a second. Why does the device hang on a page fault? That shouldn't
> happen independent of noretry.
>
> So that strongly sounds like this is just hiding a bug elsewhere.

Fair enough, but this is also the only gfx9 APU which defaults to
noretry=1; all of the rest are dGPUs. I'd argue it should align with
the other GFX9 APUs or they should all enable noretry=1.

Alex

>
> Regards,
> Christian.
>
> >
> > Fixes: 5f3854f1f4e2 ("drm/amdgpu: add more cases to noretry=1")
> > Signed-off-by: Tim Van Patten <timvp@xxxxxxxxxx>
> > ---
> >
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 1 -
> >  1 file changed, 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> > index be4629cdac049..bff54a20835f1 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> > @@ -876,7 +876,6 @@ void amdgpu_gmc_noretry_set(struct amdgpu_device *adev)
> >       struct amdgpu_gmc *gmc = &adev->gmc;
> >       uint32_t gc_ver = amdgpu_ip_version(adev, GC_HWIP, 0);
> >       bool noretry_default = (gc_ver == IP_VERSION(9, 0, 1) ||
> > -                             gc_ver == IP_VERSION(9, 3, 0) ||
> >                               gc_ver == IP_VERSION(9, 4, 0) ||
> >                               gc_ver == IP_VERSION(9, 4, 1) ||
> >                               gc_ver == IP_VERSION(9, 4, 2) ||
>