On 2017-08-11 08:54 PM, StDenis, Tom wrote:
> Hi Felix,
>
> Well it's really up to Christian and Alex but I'd keep an eye on this since it'll cause issues with embedded down the road.
>
> I happen to have a CZ system so I could possibly try and bisect 4.11/4.12 and see if there's any stable points for you guys.

I doubt there is a stable point. On the KFD branch we've always had GFX
power gating disabled, because it was causing us problems as soon as we
picked up kernel 4.6 in August 2016, which first introduced CZ power
gating to the KFD branch.

> Is there a short and simple KFD setup I can install/run to test it? Or is simply loading a KFD merged/rebased kernel enough to cause the hang (and thus I guess a bisect doesn't make sense).

With patch 19 in this series, it's a hang during boot. Without it, you
can boot, and you'll get errors from kfdtest due to MEC hangs as soon as
a user mode queue is created.

You'd need a modified Thunk and KFDTest for this experiment. You could
get both from a recent roc-master build. The rest of the ROCm stack
isn't needed. KFDTest isn't released to the public, and the last public
release doesn't include the necessary Thunk changes yet. I think the
Thunk change will make it into ROCm 1.6.3.

I've also been able to run hsaconformance (which I think is included in
our public releases) with 74% of tests passing. OCL tests currently
segfault in the HSA runtime, as do some of the conformance tests. I'm
going to look into the HSA runtime a bit more to see if I can get OCL to
work for more realistic testing.

Regards,
  Felix

>
> Cheers,
> Tom
>
> ________________________________________
> From: Kuehling, Felix
> Sent: Friday, August 11, 2017 20:40
> To: StDenis, Tom; amd-gfx at lists.freedesktop.org; oded.gabbay at gmail.com
> Subject: Re: [PATCH 18/19] drm/amdgpu: Disable GFX PG on CZ
>
> With the next change that adds programming of RLC_CP_SCHEDULERS it's a
> VM fault and hard hang during boot, just after HWS initialization.
> Without that change it's only a MEC hang when the first application
> tries to create a user mode queue.
>
> Regards,
>   Felix
>
> On 2017-08-11 08:08 PM, StDenis, Tom wrote:
>> Hmm, I'd still be careful about disabling GFX PG since we may fail to meet energy star requirements.
>>
>> Does the system hard hang or simply GPU hang?
>>
>> Tom
>>
>> ________________________________________
>> From: Kuehling, Felix
>> Sent: Friday, August 11, 2017 19:56
>> To: StDenis, Tom; amd-gfx at lists.freedesktop.org; oded.gabbay at gmail.com
>> Subject: Re: [PATCH 18/19] drm/amdgpu: Disable GFX PG on CZ
>>
>> Yes, I'm up-to-date. KFD doesn't use the KIQ to map the HIQ. And HIQ
>> maps all our other queues (unless we're disabling the hardware scheduler).
>>
>> Regards,
>>   Felix
>>
>>
>> On 2017-08-11 07:45 PM, StDenis, Tom wrote:
>>> Hi Felix,
>>>
>>> I'm assuming your tree is up to date with amd-staging-4.11 or 4.12 but we did previously have issues with compute rings if PG was enabled (specifically CGCG + PG) on Carrizo. Then David committed some KIQ upgrades and it started working properly.
>>>
>>> Could that be related? Because GFX PG "should work" on Carrizo is the official line last I heard from the GFX IP team.
>>>
>>> Cheers,
>>> Tom
>>> ________________________________________
>>> From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> on behalf of Felix Kuehling <Felix.Kuehling at amd.com>
>>> Sent: Friday, August 11, 2017 17:56
>>> To: amd-gfx at lists.freedesktop.org; oded.gabbay at gmail.com
>>> Cc: Kuehling, Felix
>>> Subject: [PATCH 18/19] drm/amdgpu: Disable GFX PG on CZ
>>>
>>> It's causing problems with user mode queues and the HIQ, and can
>>> lead to hard hangs during boot after programming RLC_CP_SCHEDULERS.
>>>
>>> Signed-off-by: Felix Kuehling <Felix.Kuehling at amd.com>
>>> ---
>>>  drivers/gpu/drm/amd/amdgpu/vi.c | 3 +--
>>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/vi.c b/drivers/gpu/drm/amd/amdgpu/vi.c
>>> index 18bb3cb..495c8a3 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/vi.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/vi.c
>>> @@ -1029,8 +1029,7 @@ static int vi_common_early_init(void *handle)
>>>                 /* rev0 hardware requires workarounds to support PG */
>>>                 adev->pg_flags = 0;
>>>                 if (adev->rev_id != 0x00 || CZ_REV_BRISTOL(adev->pdev->revision)) {
>>> -                       adev->pg_flags |= AMD_PG_SUPPORT_GFX_PG |
>>> -                               AMD_PG_SUPPORT_GFX_SMG |
>>> +                       adev->pg_flags |= AMD_PG_SUPPORT_GFX_SMG |
>>>                                 AMD_PG_SUPPORT_GFX_PIPELINE |
>>>                                 AMD_PG_SUPPORT_CP |
>>>                                 AMD_PG_SUPPORT_UVD |
>>> --
>>> 2.7.4
>>>
>>> _______________________________________________
>>> amd-gfx mailing list
>>> amd-gfx at lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
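
For readers following the patch quoted above, here is a minimal standalone sketch of what dropping one bit from a power-gating mask does. It is not the driver code: the PG_SUPPORT_* bit values, cz_pg_flags() and pg_enabled() are hypothetical names invented for illustration, and the real AMD_PG_SUPPORT_* definitions in the kernel headers may use different values. Only the flags visible in the quoted hunk are modelled; the hunk is cut off after the UVD line, so anything that follows it in the real source is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignments, for illustration only. The kernel's
 * AMD_PG_SUPPORT_* values are defined elsewhere and may differ. */
#define PG_SUPPORT_GFX_PG       (1u << 0)
#define PG_SUPPORT_GFX_SMG      (1u << 1)
#define PG_SUPPORT_GFX_PIPELINE (1u << 2)
#define PG_SUPPORT_CP           (1u << 3)
#define PG_SUPPORT_UVD          (1u << 4)

/* Build a power-gating mask shaped like the patched hunk: start from 0
 * and, unless the part needs the rev0 workarounds, OR in every feature
 * shown in the hunk except GFX PG. */
static uint32_t cz_pg_flags(bool needs_rev0_workaround)
{
    uint32_t flags = 0;

    if (!needs_rev0_workaround) {
        flags |= PG_SUPPORT_GFX_SMG |
                 PG_SUPPORT_GFX_PIPELINE |
                 PG_SUPPORT_CP |
                 PG_SUPPORT_UVD;
    }
    return flags;
}

/* True if the given feature bit is set in the mask. */
static bool pg_enabled(uint32_t flags, uint32_t feature)
{
    return (flags & feature) != 0;
}
```

With this sketch, pg_enabled(cz_pg_flags(false), PG_SUPPORT_GFX_PG) is false while the other four features stay enabled, which mirrors the intent of the patch: only GFX PG is turned off, and the remaining power-gating features are left alone.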