[AMD Official Use Only]
From: Kuehling, Felix <Felix.Kuehling@xxxxxxx>
Sent: Thursday, April 21, 2022 5:21 AM
To: Lazar, Lijo <Lijo.Lazar@xxxxxxx>; Koenig, Christian <Christian.Koenig@xxxxxxx>; Wang, Yang(Kevin) <KevinYang.Wang@xxxxxxx>; Christian König <ckoenig.leichtzumerken@xxxxxxxxx>; dri-devel@xxxxxxxxxxxxxxxxxxxxx <dri-devel@xxxxxxxxxxxxxxxxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx <amd-gfx@xxxxxxxxxxxxxxxxxxxxx>
Subject: Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds kmalloc limit
On 2022-04-20 09:23, Lazar, Lijo wrote:
>
>
> On 4/20/2022 6:26 PM, Christian König wrote:
>> Am 20.04.22 um 14:54 schrieb Wang, Yang(Kevin):
>>>
>>> [AMD Official Use Only]
>>>
>>>
>>> Hi Chris,
>>>
>>> 1) Change the test case to use something larger than 1TiB.
>>> sure, we can increase the size of BO and make test pass,
>>> but if user really want to allocate 1TB GTT BO, we have no reason to
>>> let it fail? right?
>>
>> No, the reason is the underlying core kernel doesn't allow kvmalloc
>> allocations with GFP_ZERO which are large enough to hold the array of
>> allocated pages for this.
>>
>> We are working on top of the core Linux kernel and should *NEVER*
>> ever add workarounds like what was suggested here. >
>
> AFAIU, for the purpose of ttm use, fallback to vmalloc is fine.
>
> * Please note that any use of gfp flags outside of GFP_KERNEL is
> careful to not
> * fall back to vmalloc.
> *
That's weird, because kvcalloc does the same thing. If that were not
able to fall back to vmalloc, it would be pretty useless.
static inline __alloc_size(1, 2) void *kvcalloc(size_t n, size_t size, gfp_t flags)
{
return kvmalloc_array(n, size, flags | __GFP_ZERO);
}
Maybe kvcalloc is the function we TTM should be using here anyway,
instead of open-coding the kvmalloc_array call with an extra GFP flag.
Regards,
Felix
Yes, I agree with your point, and in amdkfd driver code, we have the same risk in svm_range_dma_map_dev().
Yes, sounds like a good idea to me as well to change that.
Regards,
Christian.
Best Regards,
Kevin
>
> Actually the current implementation documents the behavior, but it is
> deep inside the implementation to be noticeable - at least not obvious
> while using kvmalloc_array.
>
> Thanks,
> Lijo
>
>> Regards,
>> Christian.
>>
>>> the system availed memory about 2T, but it will still fail.
>>>
>>> 2) Change kvmalloc to allow GFP_ZERO allocations even in the vmalloc
>>> fallback path.
>>> the 5.18 kernel will add this patch to fix this issue .
>>>
>>> Best Regards,
>>> Kevin
>>> ------------------------------------------------------------------------
>>>
>>> *From:* Koenig, Christian <Christian.Koenig@xxxxxxx>
>>> *Sent:* Wednesday, April 20, 2022 8:42 PM
>>> *To:* Wang, Yang(Kevin) <KevinYang.Wang@xxxxxxx>; Christian König
>>> <ckoenig.leichtzumerken@xxxxxxxxx>; dri-devel@xxxxxxxxxxxxxxxxxxxxx
>>> <dri-devel@xxxxxxxxxxxxxxxxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
>>> <amd-gfx@xxxxxxxxxxxxxxxxxxxxx>
>>> *Subject:* Re: [PATCH] drm/ttm: fix ttm tt init fail when size
>>> exceeds kmalloc limit
>>> Hi Kevin,
>>>
>>> yes and that is perfectly valid and expected behavior. There is
>>> absolutely no need to change anything in TTM here.
>>>
>>> What we could do is:
>>> 1) Change the test case to use something larger than 1TiB.
>>> 2) Change kvmalloc to allow GFP_ZERO allocations even in the vmalloc
>>> fallback path.
>>>
>>> Regards,
>>> Christian.
>>>
>>> Am 20.04.22 um 14:39 schrieb Wang, Yang(Kevin):
>>>>
>>>> [AMD Official Use Only]
>>>>
>>>>
>>>> Hi Chirs,
>>>>
>>>> yes, right, the amdgpu drive rwill use amdgpu_bo_validate_size()
>>>> function to verify bo size,
>>>> but when driver try to allocate VRAM domain bo fail, the amdgpu
>>>> driver will fall back to allocate domain = (GTT | VRAM) bo.
>>>> please check following code, it will cause the 2nd time to allocate
>>>> bo fail during allocate 256Mb buffer to store dma address (via
>>>> kvmalloc()).
>>>>
>>>> initial_domain = (u32)(0xffffffff & args->in.domains);
>>>> retry:
>>>> r = amdgpu_gem_object_create(adev, size, args->in.alignment,
>>>> initial_domain,
>>>> flags, ttm_bo_type_device, resv, &gobj);
>>>> if (r && r != -ERESTARTSYS) {
>>>> if (flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED) {
>>>> flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
>>>> goto retry;
>>>> }
>>>>
>>>> if (initial_domain == AMDGPU_GEM_DOMAIN_VRAM) {
>>>> initial_domain |= AMDGPU_GEM_DOMAIN_GTT;
>>>> goto retry;
>>>> }
>>>> DRM_DEBUG("Failed to allocate GEM object (%llu, %d, %llu, %d)\n",
>>>> size, initial_domain, args->in.alignment, r);
>>>> }
>>>>
>>>> Best Regards,
>>>> Kevin
>>>>
>>>> ------------------------------------------------------------------------
>>>>
>>>> *From:* Christian König <ckoenig.leichtzumerken@xxxxxxxxx>
>>>> <mailto:ckoenig.leichtzumerken@xxxxxxxxx>
>>>> *Sent:* Wednesday, April 20, 2022 7:55 PM
>>>> *To:* Wang, Yang(Kevin) <KevinYang.Wang@xxxxxxx>
>>>> <mailto:KevinYang.Wang@xxxxxxx>; Koenig, Christian
>>>> <Christian.Koenig@xxxxxxx> <mailto:Christian.Koenig@xxxxxxx>;
>>>> dri-devel@xxxxxxxxxxxxxxxxxxxxx
>>>> <mailto:dri-devel@xxxxxxxxxxxxxxxxxxxxx>
>>>> <dri-devel@xxxxxxxxxxxxxxxxxxxxx>
>>>> <mailto:dri-devel@xxxxxxxxxxxxxxxxxxxxx>;
>>>> amd-gfx@xxxxxxxxxxxxxxxxxxxxx
>>>> <mailto:amd-gfx@xxxxxxxxxxxxxxxxxxxxx>
>>>> <amd-gfx@xxxxxxxxxxxxxxxxxxxxx> <mailto:amd-gfx@xxxxxxxxxxxxxxxxxxxxx>
>>>> *Subject:* Re: [PATCH] drm/ttm: fix ttm tt init fail when size
>>>> exceeds kmalloc limit
>>>> Hi Kevin,
>>>>
>>>> no, the test case should already fail in amdgpu_bo_validate_size().
>>>>
>>>> If we have a system with 2TiB of memory where the test case could
>>>> succeed then we should increase the requested size to something
>>>> larger.
>>>>
>>>> And if the underlying core Linux kernel functions don't allow
>>>> allocations as large as the requested one we should *NEVER* ever
>>>> add workarounds like this.
>>>>
>>>> It is perfectly expected that this test case is not able to fulfill
>>>> the desired allocation. That it fails during kvmalloc is
>>>> unfortunate, but not a show stopper.
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>
>>>> Am 20.04.22 um 13:32 schrieb Wang, Yang(Kevin):
>>>>>
>>>>> [AMD Official Use Only]
>>>>>
>>>>>
>>>>> Hi Chris,
>>>>>
>>>>> you misunderstood background about this case.
>>>>>
>>>>> although we expect this test case to fail, it should fail at the
>>>>> location where the Bo actual memory is actually allocated. now the
>>>>> code logic will cause the failure to allocate memory to store DMA
>>>>> address.
>>>>>
>>>>> e.g: the case is failed in 2TB system ram machine, it should be
>>>>> allocated successful, but it is failed.
>>>>>
>>>>> allocate 1TB BO, the ttm should allocate 1TB/4k * 8 buffer to
>>>>> store allocate result (page address), this should not be failed
>>>>> usually.
>>>>>
>>>>> There is a similar fix in upstream kernel 5.18, before this fix
>>>>> entered the TTM code, this problem existed in TTM.
>>>>>
>>>>> kernel/git/torvalds/linux.git - Linux kernel source tree
>>>>> <https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.18-rc3&id=a421ef303008b0ceee2cfc625c3246fa7654b0ca
>>>>> mm: allow !GFP_KERNEL allocations for kvmalloc
>>>>>
>>>>> Best Regards,
>>>>> Kevin
>>>>>
>>>>> ------------------------------------------------------------------------
>>>>>
>>>>> *From:* Koenig, Christian <Christian.Koenig@xxxxxxx>
>>>>> <mailto:Christian.Koenig@xxxxxxx>
>>>>> *Sent:* Wednesday, April 20, 2022 6:53 PM
>>>>> *To:* Wang, Yang(Kevin) <KevinYang.Wang@xxxxxxx>
>>>>> <mailto:KevinYang.Wang@xxxxxxx>; dri-devel@xxxxxxxxxxxxxxxxxxxxx
>>>>> <mailto:dri-devel@xxxxxxxxxxxxxxxxxxxxx>
>>>>> <dri-devel@xxxxxxxxxxxxxxxxxxxxx>
>>>>> <mailto:dri-devel@xxxxxxxxxxxxxxxxxxxxx>;
>>>>> amd-gfx@xxxxxxxxxxxxxxxxxxxxx
>>>>> <mailto:amd-gfx@xxxxxxxxxxxxxxxxxxxxx>
>>>>> <amd-gfx@xxxxxxxxxxxxxxxxxxxxx>
>>>>> <mailto:amd-gfx@xxxxxxxxxxxxxxxxxxxxx>
>>>>> *Subject:* Re: [PATCH] drm/ttm: fix ttm tt init fail when size
>>>>> exceeds kmalloc limit
>>>>> Am 20.04.22 um 11:07 schrieb Wang, Yang(Kevin):
>>>>>>
>>>>>> [AMD Official Use Only]
>>>>>>
>>>>>>
>>>>>>
>>>>>> ------------------------------------------------------------------------
>>>>>>
>>>>>> *From:* Koenig, Christian <Christian.Koenig@xxxxxxx>
>>>>>> <mailto:Christian.Koenig@xxxxxxx>
>>>>>> *Sent:* Wednesday, April 20, 2022 5:00 PM
>>>>>> *To:* Wang, Yang(Kevin) <KevinYang.Wang@xxxxxxx>
>>>>>> <mailto:KevinYang.Wang@xxxxxxx>; dri-devel@xxxxxxxxxxxxxxxxxxxxx
>>>>>> <mailto:dri-devel@xxxxxxxxxxxxxxxxxxxxx>
>>>>>> <dri-devel@xxxxxxxxxxxxxxxxxxxxx>
>>>>>> <mailto:dri-devel@xxxxxxxxxxxxxxxxxxxxx>;
>>>>>> amd-gfx@xxxxxxxxxxxxxxxxxxxxx
>>>>>> <mailto:amd-gfx@xxxxxxxxxxxxxxxxxxxxx>
>>>>>> <amd-gfx@xxxxxxxxxxxxxxxxxxxxx>
>>>>>> <mailto:amd-gfx@xxxxxxxxxxxxxxxxxxxxx>
>>>>>> *Subject:* Re: [PATCH] drm/ttm: fix ttm tt init fail when size
>>>>>> exceeds kmalloc limit
>>>>>> Am 20.04.22 um 10:56 schrieb Yang Wang:
>>>>>> > if the __GFP_ZERO is set, the kvmalloc() can't fallback to use
>>>>>> vmalloc()
>>>>>>
>>>>>> Hui what? Why should kvmalloc() not be able to fallback to vmalloc()
>>>>>> when __GFP_ZERO is set?
>>>>>>
>>>>>> And even that is really the case then that sounds like a bug in
>>>>>> kvmalloc().
>>>>>>
>>>>>> Regards,
>>>>>> Christian.
>>>>>>
>>>>>> [kevin]:
>>>>>> it is really test case from libdrm amdgpu test, which try to
>>>>>> allocate a big BO which will cause ttm tt init fail.
>>>>>
>>>>>
>>>>> LOL! Guys, this test case is intended to fail!
>>>>> *
>>>>> *The test consists of allocating a buffer so ridiculous large that
>>>>> it should never succeed and be rejected by the kernel driver.
>>>>>
>>>>> This patch here is a really clear NAK.
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>>> it may be a kvmalloc() bug, and this patch can as a workaround
>>>>>> in ttm before this issue fix.
>>>>>>
>>>>>> void *kvmalloc_node(size_t size, gfp_t flags, int node)
>>>>>> {
>>>>>> ...
>>>>>> if ((flags & GFP_KERNEL) != GFP_KERNEL)
>>>>>> return kmalloc_node(size, flags, node);
>>>>>> ...
>>>>>> }
>>>>>>
>>>>>> Best Regards,
>>>>>> Kevin
>>>>>>
>>>>>> > to allocate memory, when request size is exceeds kmalloc limit,
>>>>>> it will
>>>>>> > cause allocate memory fail.
>>>>>> >
>>>>>> > e.g: when ttm want to create a BO with 1TB size, it maybe fail.
>>>>>> >
>>>>>> > Signed-off-by: Yang Wang <KevinYang.Wang@xxxxxxx>
>>>>>> <mailto:KevinYang.Wang@xxxxxxx>
>>>>>> > ---
>>>>>> > drivers/gpu/drm/ttm/ttm_tt.c | 14 +++++++++++---
>>>>>> > 1 file changed, 11 insertions(+), 3 deletions(-)
>>>>>> >
>>>>>> > diff --git a/drivers/gpu/drm/ttm/ttm_tt.c
>>>>>> b/drivers/gpu/drm/ttm/ttm_tt.c
>>>>>> > index 79c870a3bef8..9f2f3e576b8d 100644
>>>>>> > --- a/drivers/gpu/drm/ttm/ttm_tt.c
>>>>>> > +++ b/drivers/gpu/drm/ttm/ttm_tt.c
>>>>>> > @@ -97,9 +97,12 @@ int ttm_tt_create(struct ttm_buffer_object
>>>>>> *bo, bool zero_alloc)
>>>>>> > static int ttm_tt_alloc_page_directory(struct ttm_tt *ttm)
>>>>>> > {
>>>>>> > ttm->pages = kvmalloc_array(ttm->num_pages, sizeof(void*),
>>>>>> > - GFP_KERNEL | __GFP_ZERO);
>>>>>> > + GFP_KERNEL);
>>>>>> > if (!ttm->pages)
>>>>>> > return -ENOMEM;
>>>>>> > +
>>>>>> > + memset(ttm->pages, 0, ttm->num_pages * sizeof(void *));
>>>>>> > +
>>>>>> > return 0;
>>>>>> > }
>>>>>> >
>>>>>> > @@ -108,10 +111,12 @@ static int
>>>>>> ttm_dma_tt_alloc_page_directory(struct ttm_tt *ttm)
>>>>>> > ttm->pages = kvmalloc_array(ttm->num_pages,
>>>>>> > sizeof(*ttm->pages) +
>>>>>> > sizeof(*ttm->dma_address),
>>>>>> > - GFP_KERNEL | __GFP_ZERO);
>>>>>> > + GFP_KERNEL);
>>>>>> > if (!ttm->pages)
>>>>>> > return -ENOMEM;
>>>>>> >
>>>>>> > + memset(ttm->pages, 0, ttm->num_pages *
>>>>>> (sizeof(*ttm->pages) + sizeof(*ttm->dma_address)));
>>>>>> > +
>>>>>> > ttm->dma_address = (void *)(ttm->pages + ttm->num_pages);
>>>>>> > return 0;
>>>>>> > }
>>>>>> > @@ -120,9 +125,12 @@ static int
>>>>>> ttm_sg_tt_alloc_page_directory(struct ttm_tt *ttm)
>>>>>> > {
>>>>>> > ttm->dma_address = kvmalloc_array(ttm->num_pages,
>>>>>> > sizeof(*ttm->dma_address),
>>>>>> > - GFP_KERNEL | __GFP_ZERO);
>>>>>> > + GFP_KERNEL);
>>>>>> > if (!ttm->dma_address)
>>>>>> > return -ENOMEM;
>>>>>> > +
>>>>>> > + memset(ttm->dma_address, 0, ttm->num_pages *
>>>>>> sizeof(*ttm->dma_address));
>>>>>> > +
>>>>>> > return 0;
>>>>>> > }
>>>>>> >
>>>>>>
>>>>>
>>>>
>>>
>>