Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds kmalloc limit

"Wang, Yang(Kevin)" <KevinYang.Wang@xxxxxxx> · Thu, 21 Apr 2022 02:15:09 +0000

[AMD Official Use Only]

From: Kuehling, Felix <Felix.Kuehling@xxxxxxx>

Sent: Thursday, April 21, 2022 5:21 AM

To: Lazar, Lijo <Lijo.Lazar@xxxxxxx>; Koenig, Christian <Christian.Koenig@xxxxxxx>; Wang, Yang(Kevin) <KevinYang.Wang@xxxxxxx>; Christian König <ckoenig.leichtzumerken@xxxxxxxxx>; dri-devel@xxxxxxxxxxxxxxxxxxxxx <dri-devel@xxxxxxxxxxxxxxxxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
 <amd-gfx@xxxxxxxxxxxxxxxxxxxxx>

Subject: Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds kmalloc limit

On 2022-04-20 09:23, Lazar, Lijo wrote:

>

>

> On 4/20/2022 6:26 PM, Christian König wrote:

>> Am 20.04.22 um 14:54 schrieb Wang, Yang(Kevin):

>>>

>>> [AMD Official Use Only]

>>>

>>>

>>> Hi Chris,

>>>

>>> 1) Change the test case to use something larger than 1TiB.

>>> sure, we can increase the size of BO and make test pass,

>>> but if user really want to allocate 1TB GTT BO, we have no reason to 

>>> let it fail? right?

>>

>> No, the reason is the underlying core kernel doesn't allow kvmalloc 

>> allocations with GFP_ZERO which are large enough to hold the array of 

>> allocated pages for this.

>>

>> We are working on top of the core Linux kernel and should *NEVER* 

>> ever add workarounds like what was suggested here. >

>

> AFAIU, for the purpose of ttm use, fallback to vmalloc is fine.

>

>  * Please note that any use of gfp flags outside of GFP_KERNEL is 

> careful to not

>  * fall back to vmalloc.

>  *

That's weird, because kvcalloc does the same thing. If that were not 

able to fall back to vmalloc, it would be pretty useless.

    static inline __alloc_size(1, 2) void *kvcalloc(size_t n, size_t size, gfp_t flags)

    {

             return kvmalloc_array(n, size, flags | __GFP_ZERO);

    }

Maybe kvcalloc is the function we TTM should be using here anyway, 

instead of open-coding the kvmalloc_array call with an extra GFP flag.

Regards,

   Felix

Yes, I agree with your point, and in amdkfd driver code, we have the same risk in svm_range_dma_map_dev().

Best Regards,

Kevin

>

> Actually the current implementation documents the behavior, but it is 

> deep inside the implementation to be noticeable - at least not obvious 

> while using kvmalloc_array.

>

> Thanks,

> Lijo

>

>> Regards,

>> Christian.

>>

>>> the system availed memory about 2T, but it will still fail.

>>>

>>> 2) Change kvmalloc to allow GFP_ZERO allocations even in the vmalloc 

>>> fallback path.

>>>     the 5.18 kernel will add this patch to fix this issue .

>>>

>>> Best Regards,

>>> Kevin

>>> ------------------------------------------------------------------------ 

>>>

>>> *From:* Koenig, Christian <Christian.Koenig@xxxxxxx>

>>> *Sent:* Wednesday, April 20, 2022 8:42 PM

>>> *To:* Wang, Yang(Kevin) <KevinYang.Wang@xxxxxxx>; Christian König 

>>> <ckoenig.leichtzumerken@xxxxxxxxx>; dri-devel@xxxxxxxxxxxxxxxxxxxxx 

>>> <dri-devel@xxxxxxxxxxxxxxxxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx 

>>> <amd-gfx@xxxxxxxxxxxxxxxxxxxxx>

>>> *Subject:* Re: [PATCH] drm/ttm: fix ttm tt init fail when size 

>>> exceeds kmalloc limit

>>> Hi Kevin,

>>>

>>> yes and that is perfectly valid and expected behavior. There is 

>>> absolutely no need to change anything in TTM here.

>>>

>>> What we could do is:

>>> 1) Change the test case to use something larger than 1TiB.

>>> 2) Change kvmalloc to allow GFP_ZERO allocations even in the vmalloc 

>>> fallback path.

>>>

>>> Regards,

>>> Christian.

>>>

>>> Am 20.04.22 um 14:39 schrieb Wang, Yang(Kevin):

>>>>

>>>> [AMD Official Use Only]

>>>>

>>>>

>>>> Hi Chirs,

>>>>

>>>> yes, right, the amdgpu drive rwill use amdgpu_bo_validate_size() 

>>>> function to verify bo size,

>>>> but when driver try to allocate VRAM domain bo fail, the amdgpu 

>>>> driver will fall back to allocate domain = (GTT | VRAM)  bo.

>>>> please check following code, it will cause the 2nd time to allocate 

>>>> bo fail during allocate 256Mb buffer to store dma address (via 

>>>> kvmalloc()).

>>>>

>>>> initial_domain = (u32)(0xffffffff & args->in.domains);

>>>> retry:

>>>>         r = amdgpu_gem_object_create(adev, size, args->in.alignment,

>>>>                    initial_domain,

>>>>                    flags, ttm_bo_type_device, resv, &gobj);

>>>>         if (r && r != -ERESTARTSYS) {

>>>>                 if (flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED) {

>>>>       flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;

>>>>       goto retry;

>>>>                 }

>>>>

>>>>                 if (initial_domain == AMDGPU_GEM_DOMAIN_VRAM) {

>>>>       initial_domain |= AMDGPU_GEM_DOMAIN_GTT;

>>>>       goto retry;

>>>>                 }

>>>> DRM_DEBUG("Failed to allocate GEM object (%llu, %d, %llu, %d)\n",

>>>>               size, initial_domain, args->in.alignment, r);

>>>>         }

>>>>

>>>> Best Regards,

>>>> Kevin

>>>>

>>>> ------------------------------------------------------------------------ 

>>>>

>>>> *From:* Christian König <ckoenig.leichtzumerken@xxxxxxxxx> 

>>>> <mailto:ckoenig.leichtzumerken@xxxxxxxxx>

>>>> *Sent:* Wednesday, April 20, 2022 7:55 PM

>>>> *To:* Wang, Yang(Kevin) <KevinYang.Wang@xxxxxxx> 

>>>> <mailto:KevinYang.Wang@xxxxxxx>; Koenig, Christian

>>>> <Christian.Koenig@xxxxxxx> <mailto:Christian.Koenig@xxxxxxx>;

>>>> dri-devel@xxxxxxxxxxxxxxxxxxxxx 

>>>> <mailto:dri-devel@xxxxxxxxxxxxxxxxxxxxx>

>>>> <dri-devel@xxxxxxxxxxxxxxxxxxxxx> 

>>>> <mailto:dri-devel@xxxxxxxxxxxxxxxxxxxxx>;

>>>> amd-gfx@xxxxxxxxxxxxxxxxxxxxx 

>>>> <mailto:amd-gfx@xxxxxxxxxxxxxxxxxxxxx>

>>>> <amd-gfx@xxxxxxxxxxxxxxxxxxxxx> <mailto:amd-gfx@xxxxxxxxxxxxxxxxxxxxx>

>>>> *Subject:* Re: [PATCH] drm/ttm: fix ttm tt init fail when size 

>>>> exceeds kmalloc limit

>>>> Hi Kevin,

>>>>

>>>> no, the test case should already fail in amdgpu_bo_validate_size().

>>>>

>>>> If we have a system with 2TiB of memory where the test case could 

>>>> succeed then we should increase the requested size to something 

>>>> larger.

>>>>

>>>> And if the underlying core Linux kernel functions don't allow 

>>>> allocations as large as the requested one we should *NEVER* ever 

>>>> add workarounds like this.

>>>>

>>>> It is perfectly expected that this test case is not able to fulfill 

>>>> the desired allocation. That it fails during kvmalloc is 

>>>> unfortunate, but not a show stopper.

>>>>

>>>> Regards,

>>>> Christian.

>>>>

>>>>

>>>> Am 20.04.22 um 13:32 schrieb Wang, Yang(Kevin):

>>>>>

>>>>> [AMD Official Use Only]

>>>>>

>>>>>

>>>>> Hi Chris,

>>>>>

>>>>> you misunderstood background about this case.

>>>>>

>>>>> although we expect this test case to fail, it should fail at the 

>>>>> location where the Bo actual memory is actually allocated. now the 

>>>>> code logic will cause the failure to allocate memory to store DMA 

>>>>> address.

>>>>>

>>>>> e.g: the case is failed in 2TB system ram machine, it should be 

>>>>> allocated successful, but it is failed.

>>>>>

>>>>> allocate 1TB BO, the ttm should allocate 1TB/4k * 8 buffer to 

>>>>> store allocate result (page address), this should not be failed 

>>>>> usually.

>>>>>

>>>>> There is a similar fix in upstream kernel 5.18, before this fix 

>>>>> entered the TTM code, this problem existed in TTM.

>>>>>

>>>>> kernel/git/torvalds/linux.git - Linux kernel source tree 

>>>>> <https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.18-rc3&id=a421ef303008b0ceee2cfc625c3246fa7654b0ca

>>>>> mm: allow !GFP_KERNEL allocations for kvmalloc

>>>>>

>>>>> Best Regards,

>>>>> Kevin

>>>>>

>>>>> ------------------------------------------------------------------------ 

>>>>>

>>>>> *From:* Koenig, Christian <Christian.Koenig@xxxxxxx> 

>>>>> <mailto:Christian.Koenig@xxxxxxx>

>>>>> *Sent:* Wednesday, April 20, 2022 6:53 PM

>>>>> *To:* Wang, Yang(Kevin) <KevinYang.Wang@xxxxxxx> 

>>>>> <mailto:KevinYang.Wang@xxxxxxx>; dri-devel@xxxxxxxxxxxxxxxxxxxxx

>>>>> <mailto:dri-devel@xxxxxxxxxxxxxxxxxxxxx>

>>>>> <dri-devel@xxxxxxxxxxxxxxxxxxxxx> 

>>>>> <mailto:dri-devel@xxxxxxxxxxxxxxxxxxxxx>;

>>>>> amd-gfx@xxxxxxxxxxxxxxxxxxxxx 

>>>>> <mailto:amd-gfx@xxxxxxxxxxxxxxxxxxxxx>

>>>>> <amd-gfx@xxxxxxxxxxxxxxxxxxxxx> 

>>>>> <mailto:amd-gfx@xxxxxxxxxxxxxxxxxxxxx>

>>>>> *Subject:* Re: [PATCH] drm/ttm: fix ttm tt init fail when size 

>>>>> exceeds kmalloc limit

>>>>> Am 20.04.22 um 11:07 schrieb Wang, Yang(Kevin):

>>>>>>

>>>>>> [AMD Official Use Only]

>>>>>>

>>>>>>

>>>>>>

>>>>>> ------------------------------------------------------------------------ 

>>>>>>

>>>>>> *From:* Koenig, Christian <Christian.Koenig@xxxxxxx> 

>>>>>> <mailto:Christian.Koenig@xxxxxxx>

>>>>>> *Sent:* Wednesday, April 20, 2022 5:00 PM

>>>>>> *To:* Wang, Yang(Kevin) <KevinYang.Wang@xxxxxxx> 

>>>>>> <mailto:KevinYang.Wang@xxxxxxx>; dri-devel@xxxxxxxxxxxxxxxxxxxxx

>>>>>> <mailto:dri-devel@xxxxxxxxxxxxxxxxxxxxx>

>>>>>> <dri-devel@xxxxxxxxxxxxxxxxxxxxx> 

>>>>>> <mailto:dri-devel@xxxxxxxxxxxxxxxxxxxxx>;

>>>>>> amd-gfx@xxxxxxxxxxxxxxxxxxxxx 

>>>>>> <mailto:amd-gfx@xxxxxxxxxxxxxxxxxxxxx>

>>>>>> <amd-gfx@xxxxxxxxxxxxxxxxxxxxx> 

>>>>>> <mailto:amd-gfx@xxxxxxxxxxxxxxxxxxxxx>

>>>>>> *Subject:* Re: [PATCH] drm/ttm: fix ttm tt init fail when size 

>>>>>> exceeds kmalloc limit

>>>>>> Am 20.04.22 um 10:56 schrieb Yang Wang:

>>>>>> > if the __GFP_ZERO is set, the kvmalloc() can't fallback to use 

>>>>>> vmalloc()

>>>>>>

>>>>>> Hui what? Why should kvmalloc() not be able to fallback to vmalloc()

>>>>>> when __GFP_ZERO is set?

>>>>>>

>>>>>> And even that is really the case then that sounds like a bug in 

>>>>>> kvmalloc().

>>>>>>

>>>>>> Regards,

>>>>>> Christian.

>>>>>>

>>>>>> [kevin]:

>>>>>> it is really test case from libdrm amdgpu test, which try to 

>>>>>> allocate a big BO which will cause ttm tt init fail.

>>>>>

>>>>>

>>>>> LOL! Guys, this test case is intended to fail!

>>>>> *

>>>>> *The test consists of allocating a buffer so ridiculous large that 

>>>>> it should never succeed and be rejected by the kernel driver.

>>>>>

>>>>> This patch here is a really clear NAK.

>>>>>

>>>>> Regards,

>>>>> Christian.

>>>>>

>>>>>> it may be a kvmalloc() bug, and this patch can as a workaround 

>>>>>> in ttm before this issue fix.

>>>>>>

>>>>>> void *kvmalloc_node(size_t size, gfp_t flags, int node)

>>>>>> {

>>>>>> ...

>>>>>>       if ((flags & GFP_KERNEL) != GFP_KERNEL)

>>>>>>               return kmalloc_node(size, flags, node);

>>>>>> ...

>>>>>> }

>>>>>>

>>>>>> Best Regards,

>>>>>> Kevin

>>>>>>

>>>>>> > to allocate memory, when request size is exceeds kmalloc limit, 

>>>>>> it will

>>>>>> > cause allocate memory fail.

>>>>>> >

>>>>>> > e.g: when ttm want to create a BO with 1TB size, it maybe fail.

>>>>>> >

>>>>>> > Signed-off-by: Yang Wang <KevinYang.Wang@xxxxxxx> 

>>>>>> <mailto:KevinYang.Wang@xxxxxxx>

>>>>>> > ---

>>>>>> >   drivers/gpu/drm/ttm/ttm_tt.c | 14 +++++++++++---

>>>>>> >   1 file changed, 11 insertions(+), 3 deletions(-)

>>>>>> >

>>>>>> > diff --git a/drivers/gpu/drm/ttm/ttm_tt.c 

>>>>>> b/drivers/gpu/drm/ttm/ttm_tt.c

>>>>>> > index 79c870a3bef8..9f2f3e576b8d 100644

>>>>>> > --- a/drivers/gpu/drm/ttm/ttm_tt.c

>>>>>> > +++ b/drivers/gpu/drm/ttm/ttm_tt.c

>>>>>> > @@ -97,9 +97,12 @@ int ttm_tt_create(struct ttm_buffer_object 

>>>>>> *bo, bool zero_alloc)

>>>>>> >   static int ttm_tt_alloc_page_directory(struct ttm_tt *ttm)

>>>>>> >   {

>>>>>> >        ttm->pages = kvmalloc_array(ttm->num_pages, sizeof(void*),

>>>>>> > -                     GFP_KERNEL | __GFP_ZERO);

>>>>>> > + GFP_KERNEL);

>>>>>> >        if (!ttm->pages)

>>>>>> >                return -ENOMEM;

>>>>>> > +

>>>>>> > +     memset(ttm->pages, 0, ttm->num_pages * sizeof(void *));

>>>>>> > +

>>>>>> >        return 0;

>>>>>> >   }

>>>>>> >

>>>>>> > @@ -108,10 +111,12 @@ static int 

>>>>>> ttm_dma_tt_alloc_page_directory(struct ttm_tt *ttm)

>>>>>> >        ttm->pages = kvmalloc_array(ttm->num_pages,

>>>>>> > sizeof(*ttm->pages) +

>>>>>> > sizeof(*ttm->dma_address),

>>>>>> > - GFP_KERNEL | __GFP_ZERO);

>>>>>> > + GFP_KERNEL);

>>>>>> >        if (!ttm->pages)

>>>>>> >                return -ENOMEM;

>>>>>> >

>>>>>> > +     memset(ttm->pages, 0, ttm->num_pages * 

>>>>>> (sizeof(*ttm->pages) + sizeof(*ttm->dma_address)));

>>>>>> > +

>>>>>> >        ttm->dma_address = (void *)(ttm->pages + ttm->num_pages);

>>>>>> >        return 0;

>>>>>> >   }

>>>>>> > @@ -120,9 +125,12 @@ static int 

>>>>>> ttm_sg_tt_alloc_page_directory(struct ttm_tt *ttm)

>>>>>> >   {

>>>>>> >        ttm->dma_address = kvmalloc_array(ttm->num_pages,

>>>>>> > sizeof(*ttm->dma_address),

>>>>>> > - GFP_KERNEL | __GFP_ZERO);

>>>>>> > + GFP_KERNEL);

>>>>>> >        if (!ttm->dma_address)

>>>>>> >                return -ENOMEM;

>>>>>> > +

>>>>>> > +     memset(ttm->dma_address, 0, ttm->num_pages * 

>>>>>> sizeof(*ttm->dma_address));

>>>>>> > +

>>>>>> >        return 0;

>>>>>> >   }

>>>>>> >

>>>>>>

>>>>>

>>>>

>>>

>>