On 2019-06-03 7:23 a.m., Christian König wrote: > Am 03.06.19 um 12:17 schrieb Christian König: >> Am 01.06.19 um 00:01 schrieb Kuehling, Felix: >>> On 2019-05-31 5:32 p.m., Yang, Philip wrote: >>>> On 2019-05-31 3:42 p.m., Kuehling, Felix wrote: >>>>> On 2019-05-31 1:28 p.m., Yang, Philip wrote: >>>>>> On 2019-05-30 6:36 p.m., Kuehling, Felix wrote: >>>>>>>> #if IS_ENABLED(CONFIG_DRM_AMDGPU_USERPTR) >>>>>>>> - if (gtt->ranges && >>>>>>>> - ttm->pages[0] == hmm_pfn_to_page(>t->ranges[0], >>>>>>>> - gtt->ranges[0].pfns[0])) >>>>>>>> + if (gtt->range && >>>>>>>> + ttm->pages[0] == hmm_device_entry_to_page(gtt->range, >>>>>>>> + gtt->range->pfns[0])) >>>>>>> I think just checking gtt->range here is enough. If gtt->range is >>>>>>> not >>>>>>> NULL here, we're leaking memory. >>>>>>> >>>>>> If just checking gtt->range, there is a false warning in amdgpu_test >>>>>> userptr case in amdgpu_cs_list_validate path. If userptr is >>>>>> invalidated, >>>>>> then ttm->pages[0] is outdated pages, lobj->user_pages is new >>>>>> pages, it >>>>>> goes to ttm_tt_unbind first to unpin old pages (this causes false >>>>>> warning) then call amdgpu_ttm_tt_set_user_pages. >>>>> But doesn't that mean we're leaking the gtt->range somewhere? >>>>> >>>> ttm_tt_unbind is called from ttm_tt_destroy and >>>> amdgpu_cs_list_validate, >>>> the later one causes false warning. ttm_ttm_destory path is fine to >>>> only >>>> check gtt->range. >>>> >>>> Double checked, amdgpu_ttm_tt_get_user_pages and >>>> amdgpu_ttm_tt_get_user_pages_done always match in both paths, so no >>>> leak >>>> gtt->range. >>>> >>>> 1. amdgpu_gem_userptr_ioctl >>>> amdgpu_ttm_tt_get_user_pages >>>> ttm->pages for userptr pages >>>> amdgpu_ttm_tt_get_user_pages_done >>>> >>>> 2. amdgpu_cs_ioctl >>>> amdgpu_cs_parser_bos >>>> amdgpu_ttm_tt_get_user_pages >>>> if (userpage_invalidated) >>>> e->user_pages for new pages >>>> amdgpu_cs_list_validate >>>> if (userpage_invalidated) >>>> ttm_tt_unbind ttm->pages // this causes warning >>>> amdgpu_ttm_tt_set_user_pages(ttm->pages, e->user_pages) >>> Hmm, I think amdgpu_cs is doing something weird there. It does some >>> double book-keeping of the user pages in the BO list and the TTM BO. We >>> did something similar for KFD and simplified it when moving to HMM. It >>> could probably be simplified for amdgpu_cs as well. But not in this >>> patch series. >> >> That actually sounds like a bug to me. >> >> It is most likely a leftover from the time when we couldn't get the >> pages for a BO while the BO was reserved. > > Mhm, at least it's not racy in the way I thought it would be. But it is > certainly still overkill and should be simplified. > > Philip are you taking a look or should I tackle this? > Hi Christian, I will submit another patch to simplify amdgpu_cs_ioctl path, please help review it. Thanks, Philip > Thanks, > Christian. > >> >> >> Going to take a closer look, >> Christian. >> >>> >>> I'll review your updated change. >>> >>> Thanks, >>> Felix >>> >>> >>>> amdgpu_cs_submit >>>> amdgpu_ttm_tt_get_user_pages_done >>>> >>>>> Regards, >>>>> Felix >>>>> >>>>> >>>>>> I will submit patch v2, to add retry if hmm_range_fault returns >>>>>> -EAGAIN. >>>>>> use kzalloc to allocate small size range. >>>>>> >>>>>> Thanks, >>>>>> Philip >>>>>> >>>>>>> Regards, >>>>>>> Felix >>>>>>> >>>>>>> >>> _______________________________________________ >>> amd-gfx mailing list >>> amd-gfx@xxxxxxxxxxxxxxxxxxxxx >>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx >> > _______________________________________________ amd-gfx mailing list amd-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/amd-gfx