[PATCH 05/12] drm/amdgpu: Send no-retry XNACK for all fault types

John.Bridgman@xxxxxxx (Bridgman, John) · Wed, 12 Jul 2017 16:15:13 +0000

>-----Original Message-----
>From: amd-gfx [mailto:amd-gfx-bounces at lists.freedesktop.org] On Behalf
>Of Alex Deucher
>Sent: Wednesday, July 12, 2017 11:59 AM
>To: Kuehling, Felix
>Cc: amd-gfx list
>Subject: Re: [PATCH 05/12] drm/amdgpu: Send no-retry XNACK for all fault
>types
>
>On Wed, Jul 12, 2017 at 1:40 AM, Felix Kuehling <felix.kuehling at amd.com>
>wrote:
>> Any comments?
>>
>> I believe this is a nice stability improvement. With this change, VM
>> faults no longer take down the whole GPU with an interrupt storm. With
>> KFD we can recover without a GPU reset in many cases just by unmapping
>> the offending process's queues.
>
>Will this cause any problems with enabling recoverable page faults later?
>If not,
>Acked-by: Alex Deucher <alexander.deucher at amd.com>

We will need to back this out in order to enable recoverable page faults later, but it is probably still worth doing in the short term, IMO.
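
For anyone skimming the thread, here is a minimal sketch of what the patch's REG_SET_FIELD(..., RETRY_PERMISSION_OR_INVALID_PAGE_FAULT, 0) call amounts to: a read-modify-write that clears one field and leaves the rest of the register value alone. The shift, mask, and demo_ names below are hypothetical and exist only for illustration; the real field layout comes from the amdgpu register headers and is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field layout, for illustration only. The real shift and mask
 * for RETRY_PERMISSION_OR_INVALID_PAGE_FAULT live in the register headers. */
#define DEMO_RETRY_FAULT_SHIFT 7u
#define DEMO_RETRY_FAULT_MASK  (1u << DEMO_RETRY_FAULT_SHIFT)

/* Replace one field of a 32-bit register value; all other bits are kept. */
static inline uint32_t demo_set_field(uint32_t reg, uint32_t mask,
                                      uint32_t shift, uint32_t val)
{
    return (reg & ~mask) | ((val << shift) & mask);
}

/* Clear the (hypothetical) retry bit, i.e. request no-retry behavior. */
static inline uint32_t demo_clear_retry(uint32_t reg)
{
    return demo_set_field(reg, DEMO_RETRY_FAULT_MASK,
                          DEMO_RETRY_FAULT_SHIFT, 0u);
}

/* Small self-check: only the demo retry bit changes. */
static inline void demo_self_check(void)
{
    assert(demo_clear_retry(0xFFFFFFFFu) == 0xFFFFFF7Fu);
    assert(demo_clear_retry(0x00000000u) == 0x00000000u);
}
```

Backing the change out later, as mentioned above, would presumably mean writing 1 to the same field again or dropping the extra write; the sketch does not model the hardware side of that.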

>
>>
>> Regards,
>>   Felix
>>
>>
>> On 17-07-03 05:11 PM, Felix Kuehling wrote:
>>> From: Jay Cornwall <Jay.Cornwall at amd.com>
>>>
>>> A subset of VM fault types currently sends retry XNACK to the client.
>>> This causes a storm of interrupts from the VM to the host.
>>>
>>> Until the storm is throttled by other means, send no-retry XNACK for
>>> all fault types instead. This does not change behavior for the client,
>>> which will stall indefinitely with the current configuration in any
>>> case. It improves system stability under GC or MMHUB faults.
>>>
>>> Signed-off-by: Jay Cornwall <Jay.Cornwall at amd.com>
>>> Reviewed-by: Felix Kuehling <Felix.Kuehling at amd.com>
>>> ---
>>>  drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c | 3 +++
>>>  drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c  | 3 +++
>>>  2 files changed, 6 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c b/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c
>>> index a42f483..f957b18 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c
>>> @@ -206,6 +206,9 @@ static void gfxhub_v1_0_setup_vmid_config(struct amdgpu_device *adev)
>>>               tmp = REG_SET_FIELD(tmp, VM_CONTEXT1_CNTL,
>>>                               PAGE_TABLE_BLOCK_SIZE,
>>>                               adev->vm_manager.block_size - 9);
>>> +             /* Send no-retry XNACK on fault to suppress VM fault storm. */
>>> +             tmp = REG_SET_FIELD(tmp, VM_CONTEXT1_CNTL,
>>> +                             RETRY_PERMISSION_OR_INVALID_PAGE_FAULT, 0);
>>>               WREG32_SOC15_OFFSET(GC, 0, mmVM_CONTEXT1_CNTL, i, tmp);
>>>               WREG32_SOC15_OFFSET(GC, 0, mmVM_CONTEXT1_PAGE_TABLE_START_ADDR_LO32, i*2, 0);
>>>               WREG32_SOC15_OFFSET(GC, 0, mmVM_CONTEXT1_PAGE_TABLE_START_ADDR_HI32, i*2, 0);
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c
>>> index 01918dc..b760018 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c
>>> @@ -222,6 +222,9 @@ static void mmhub_v1_0_setup_vmid_config(struct amdgpu_device *adev)
>>>               tmp = REG_SET_FIELD(tmp, VM_CONTEXT1_CNTL,
>>>                               PAGE_TABLE_BLOCK_SIZE,
>>>                               adev->vm_manager.block_size - 9);
>>> +             /* Send no-retry XNACK on fault to suppress VM fault storm. */
>>> +             tmp = REG_SET_FIELD(tmp, VM_CONTEXT1_CNTL,
>>> +                             RETRY_PERMISSION_OR_INVALID_PAGE_FAULT, 0);
>>>               WREG32_SOC15_OFFSET(MMHUB, 0, mmVM_CONTEXT1_CNTL, i, tmp);
>>>               WREG32_SOC15_OFFSET(MMHUB, 0, mmVM_CONTEXT1_PAGE_TABLE_START_ADDR_LO32, i*2, 0);
>>>               WREG32_SOC15_OFFSET(MMHUB, 0, mmVM_CONTEXT1_PAGE_TABLE_START_ADDR_HI32, i*2, 0);
>>
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx