RE: [PATCH 2/2] drm/amdgpu: Use new TTM flag to avoid OOM triggering.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




-----Original Message-----
From: Christian König [mailto:ckoenig.leichtzumerken@xxxxxxxxx] 
Sent: Tuesday, January 16, 2018 8:46 PM
To: Grodzovsky, Andrey <Andrey.Grodzovsky@xxxxxxx>; Koenig, Christian <Christian.Koenig@xxxxxxx>; He, Roger <Hongbo.He@xxxxxxx>; dri-devel@xxxxxxxxxxxxxxxxxxxxx; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
Subject: Re: [PATCH 2/2] drm/amdgpu: Use new TTM flag to avoid OOM triggering.

Am 16.01.2018 um 13:43 schrieb Andrey Grodzovsky:
>
>
> On 01/16/2018 03:54 AM, Christian König wrote:
>> Am 16.01.2018 um 07:18 schrieb He, Roger:
>>> -----Original Message-----
>>> From: Andrey Grodzovsky [mailto:andrey.grodzovsky@xxxxxxx]
>>> Sent: Saturday, January 13, 2018 6:29 AM
>>> To: dri-devel@xxxxxxxxxxxxxxxxxxxxx; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
>>> Cc: Koenig, Christian <Christian.Koenig@xxxxxxx>; He, Roger 
>>> <Hongbo.He@xxxxxxx>; Grodzovsky, Andrey <Andrey.Grodzovsky@xxxxxxx>
>>> Subject: [PATCH 2/2] drm/amdgpu: Use new TTM flag to avoid OOM 
>>> triggering.
>>>
>>> This to have a load time option to avoid OOM on RAM allocations.
>>>
>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@xxxxxxx>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu.h        | 1 +
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c    | 4 ++++
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 4 ++++
>>>   3 files changed, 9 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> index b7c181e..1387239 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> @@ -127,6 +127,7 @@ extern int amdgpu_job_hang_limit;  extern int 
>>> amdgpu_lbpw;  extern int amdgpu_compute_multipipe;  extern int 
>>> amdgpu_gpu_recovery;
>>> +extern int amdgpu_alloc_no_oom;
>>>     #ifdef CONFIG_DRM_AMDGPU_SI
>>>   extern int amdgpu_si_support;
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>> index d96f9ac..6e98189 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>> @@ -130,6 +130,7 @@ int amdgpu_job_hang_limit = 0;  int amdgpu_lbpw 
>>> = -1;  int amdgpu_compute_multipipe = -1;  int amdgpu_gpu_recovery = 
>>> -1; /* auto */
>>> +int amdgpu_alloc_no_oom = -1; /* auto */
>>>
>>> How about turn it on as default?
>>
>> I think we can even go a step further, drop the module parameter and 
>> just turn it always on for amdgpu.
>>
>> Christian.
>
> Will fix, just a reminder that Roger's patches - [PATCH 1/2] drm/ttm: 
> don't update global memory count for some special cases [PATCH 2/2] 
> drm/ttm: only free pages rather than update global memory count 
> together
>
> Needs to be merged before my patches since the fix a TTM bug on 
> allocation failure.

	The second is merged, but I had some comments on the first and Roger hasn't replied yet.

	Roger what's the status on that one?

Already fixed locally, but not tested yet.  Try to send out today.

Thanks
Roger(Hongbo.He)

>
> Thanks,
> Andrey
>
>>
>>>
>>> Thanks
>>> Roger(Hongbo.He)
>>>
>>> MODULE_PARM_DESC(vramlimit, "Restrict VRAM for testing, in 
>>> megabytes");  module_param_named(vramlimit, amdgpu_vram_limit, int, 
>>> 0600); @@ -285,6 +286,9 @@ module_param_named(compute_multipipe,
>>> amdgpu_compute_multipipe, int, 0444); MODULE_PARM_DESC(gpu_recovery, 
>>> "Enable GPU recovery mechanism, (1 = enable, 0 = disable, -1 = 
>>> auto"); module_param_named(gpu_recovery, amdgpu_gpu_recovery, int, 
>>> 0444);
>>>   +MODULE_PARM_DESC(alloc_no_oom, "Allocate RAM without triggering 
>>> OOM
>>> +killer, (1 = enable, 0 = disable, -1 = auto"); 
>>> +module_param_named(alloc_no_oom, amdgpu_alloc_no_oom, int, 0444);
>>> +
>>>   #ifdef CONFIG_DRM_AMDGPU_SI
>>>     #if defined(CONFIG_DRM_RADEON) ||
>>> defined(CONFIG_DRM_RADEON_MODULE) diff --git 
>>> a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> index 5c4c3e0..fc27164 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> @@ -420,6 +420,10 @@ static int amdgpu_bo_do_create(struct 
>>> amdgpu_device *adev,  #endif
>>>         bo->tbo.bdev = &adev->mman.bdev;
>>> +
>>> +    if (amdgpu_alloc_no_oom == 1)
>>> +        bo->tbo.bdev->no_retry = true;
>>> +
>>>       amdgpu_ttm_placement_from_domain(bo, domain);
>>>         r = ttm_bo_init_reserved(&adev->mman.bdev, &bo->tbo, size, 
>>> type,
>>> --
>>> 2.7.4
>>>
>>> _______________________________________________
>>> dri-devel mailing list
>>> dri-devel@xxxxxxxxxxxxxxxxxxxxx
>>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>>
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@xxxxxxxxxxxxxxxxxxxxx
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel




[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux