Re: [PATCH] drm/amdgpu: use new HMM APIs and helpers

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Am 03.06.19 um 12:17 schrieb Christian König:
Am 01.06.19 um 00:01 schrieb Kuehling, Felix:
On 2019-05-31 5:32 p.m., Yang, Philip wrote:
On 2019-05-31 3:42 p.m., Kuehling, Felix wrote:
On 2019-05-31 1:28 p.m., Yang, Philip wrote:
On 2019-05-30 6:36 p.m., Kuehling, Felix wrote:
              #if IS_ENABLED(CONFIG_DRM_AMDGPU_USERPTR)
-    if (gtt->ranges &&
-        ttm->pages[0] == hmm_pfn_to_page(&gtt->ranges[0],
-                         gtt->ranges[0].pfns[0]))
+    if (gtt->range &&
+        ttm->pages[0] == hmm_device_entry_to_page(gtt->range,
+ gtt->range->pfns[0]))
I think just checking gtt->range here is enough. If gtt->range is not
NULL here, we're leaking memory.

If just checking gtt->range, there is a false warning in amdgpu_test
userptr case in amdgpu_cs_list_validate path. If userptr is invalidated, then ttm->pages[0] is outdated pages, lobj->user_pages is new pages, it
goes to ttm_tt_unbind first to unpin old pages (this causes false
warning) then call amdgpu_ttm_tt_set_user_pages.
But doesn't that mean we're leaking the gtt->range somewhere?

ttm_tt_unbind is called from ttm_tt_destroy and amdgpu_cs_list_validate, the later one causes false warning. ttm_ttm_destory path is fine to only
check gtt->range.

Double checked, amdgpu_ttm_tt_get_user_pages and
amdgpu_ttm_tt_get_user_pages_done always match in both paths, so no leak
gtt->range.

1. amdgpu_gem_userptr_ioctl
         amdgpu_ttm_tt_get_user_pages
             ttm->pages for userptr pages
         amdgpu_ttm_tt_get_user_pages_done

2. amdgpu_cs_ioctl
         amdgpu_cs_parser_bos
             amdgpu_ttm_tt_get_user_pages
             if (userpage_invalidated)
                 e->user_pages for new pages
             amdgpu_cs_list_validate
                 if (userpage_invalidated)
                    ttm_tt_unbind ttm->pages // this causes warning
amdgpu_ttm_tt_set_user_pages(ttm->pages, e->user_pages)
Hmm, I think amdgpu_cs is doing something weird there. It does some
double book-keeping of the user pages in the BO list and the TTM BO. We
did something similar for KFD and simplified it when moving to HMM. It
could probably be simplified for amdgpu_cs as well. But not in this
patch series.

That actually sounds like a bug to me.

It is most likely a leftover from the time when we couldn't get the pages for a BO while the BO was reserved.

Mhm, at least it's not racy in the way I thought it would be. But it is certainly still overkill and should be simplified.

Philip are you taking a look or should I tackle this?

Thanks,
Christian.



Going to take a closer look,
Christian.


I'll review your updated change.

Thanks,
    Felix


         amdgpu_cs_submit
             amdgpu_ttm_tt_get_user_pages_done

Regards,
      Felix


I will submit patch v2, to add retry if hmm_range_fault returns -EAGAIN.
use kzalloc to allocate small size range.

Thanks,
Philip

Regards,
        Felix


_______________________________________________
amd-gfx mailing list
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


_______________________________________________
amd-gfx mailing list
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/amd-gfx




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux