RE: [PATCH V2] drm/amdgpu: Fix incorrect return value

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



[AMD Official Use Only - General]

Hi Christian:
   If an ecc error occurs at an address, HW will generate an interrupt to SW to retire all pages located in the same physical row as the error address based on the physical characteristics of the memory device.
   Therefore, if other pages located on the same physical row as the error address also occur ecc errors later, HW will also generate multiple interrupts to SW to retire these same pages again, so that amdgpu_vram_mgr_reserve_range will be called multiple times to reserve the same pages.

    I think it's more appropriate to do the status check inside the function. If the function entry is not checked, people who are not familiar with this part of the code can easily make mistakes when calling the function.


-----------------
Best Regards,
Thomas

-----Original Message-----
From: Christian König <ckoenig.leichtzumerken@xxxxxxxxx>
Sent: Friday, April 12, 2024 5:24 PM
To: Chai, Thomas <YiPeng.Chai@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
Cc: Chai, Thomas <YiPeng.Chai@xxxxxxx>; Zhang, Hawking <Hawking.Zhang@xxxxxxx>; Zhou1, Tao <Tao.Zhou1@xxxxxxx>; Li, Candice <Candice.Li@xxxxxxx>; Wang, Yang(Kevin) <KevinYang.Wang@xxxxxxx>; Yang, Stanley <Stanley.Yang@xxxxxxx>
Subject: Re: [PATCH V2] drm/amdgpu: Fix incorrect return value

Am 12.04.24 um 10:55 schrieb YiPeng Chai:
> [Why]
>    After calling amdgpu_vram_mgr_reserve_range multiple times with the
> same address, calling amdgpu_vram_mgr_query_page_status will always
> return -EBUSY.
>    From the second call to amdgpu_vram_mgr_reserve_range, the same
> address will be added to the reservations_pending list again and is
> never moved to the reserved_pages list because the address had been
> reserved.

Well just to make it clear that approach is a NAK until my concerns are solved.

Regards,
Christian.

>
> [How]
>    First add the address status check before calling
> amdgpu_vram_mgr_do_reserve, if the address is already reserved, do
> nothing; If the address is already in the reservations_pending list,
> directly reserve memory; only add new nodes for the addresses that are
> not in the reserved_pages list and reservations_pending list.
>
> V2:
>   Avoid repeated locking/unlocking.
>
> Signed-off-by: YiPeng Chai <YiPeng.Chai@xxxxxxx>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 25 +++++++++++++-------
>   1 file changed, 16 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> index 1e36c428d254..a636d3f650b1 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> @@ -317,7 +317,6 @@ static void amdgpu_vram_mgr_do_reserve(struct
> ttm_resource_manager *man)
>
>               dev_dbg(adev->dev, "Reservation 0x%llx - %lld, Succeeded\n",
>                       rsv->start, rsv->size);
> -
>               vis_usage = amdgpu_vram_mgr_vis_size(adev, block);
>               atomic64_add(vis_usage, &mgr->vis_usage);
>               spin_lock(&man->bdev->lru_lock);
> @@ -340,19 +339,27 @@ int amdgpu_vram_mgr_reserve_range(struct amdgpu_vram_mgr *mgr,
>                                 uint64_t start, uint64_t size)
>   {
>       struct amdgpu_vram_reservation *rsv;
> +     int ret = 0;
>
> -     rsv = kzalloc(sizeof(*rsv), GFP_KERNEL);
> -     if (!rsv)
> -             return -ENOMEM;
> +     ret = amdgpu_vram_mgr_query_page_status(mgr, start);
> +     if (!ret)
> +             return 0;
>
> -     INIT_LIST_HEAD(&rsv->allocated);
> -     INIT_LIST_HEAD(&rsv->blocks);
> +     if (ret == -ENOENT) {
> +             rsv = kzalloc(sizeof(*rsv), GFP_KERNEL);
> +             if (!rsv)
> +                     return -ENOMEM;
>
> -     rsv->start = start;
> -     rsv->size = size;
> +             INIT_LIST_HEAD(&rsv->allocated);
> +             INIT_LIST_HEAD(&rsv->blocks);
> +
> +             rsv->start = start;
> +             rsv->size = size;
> +     }
>
>       mutex_lock(&mgr->lock);
> -     list_add_tail(&rsv->blocks, &mgr->reservations_pending);
> +     if (ret == -ENOENT)
> +             list_add_tail(&rsv->blocks, &mgr->reservations_pending);
>       amdgpu_vram_mgr_do_reserve(&mgr->manager);
>       mutex_unlock(&mgr->lock);
>





[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux