On 10.09.2018 at 15:05, Tom St Denis wrote:
> On 2018-09-10 9:04 a.m., Christian König wrote:
>> Hi Tom,
>>
>> I'm talking about adding new printks to figure out what the heck is
>> going wrong here.
>>
>> Thanks,
>> Christian.
>
> Hi Christian,
>
> Sure, if you want to send me a simple patch that adds more printks I'll
> gladly give it a try (doubly so since my workstation depends on our
> staging tree to work properly...).

Just add a printk to ttm_bo_bulk_move_helper to print pos->first and
pos->last.

And another one to amdgpu_bo_destroy to print the value of tbo.

Christian.

>
> Tom
>
>>
>> On 10.09.2018 at 14:59, Tom St Denis wrote:
>>> Hi Christian,
>>>
>>> Are you adding new traces or turning on existing ones? Would you
>>> like me to try them out in my setup?
>>>
>>> Tom
>>>
>>> On 2018-09-10 8:49 a.m., Christian König wrote:
>>>> On 10.09.2018 at 14:05, Huang Rui wrote:
>>>>> On Mon, Sep 10, 2018 at 05:25:48PM +0800, Koenig, Christian wrote:
>>>>>> On 10.09.2018 at 11:23, Huang Rui wrote:
>>>>>>> On Mon, Sep 10, 2018 at 11:00:04AM +0200, Christian König wrote:
>>>>>>>> Hi Ray,
>>>>>>>>
>>>>>>>> well, those patches don't make sense; the pointer is only
>>>>>>>> local to the function.
>>>>>>> You're right.
>>>>>>> I narrowed it down with a gdb dump from
>>>>>>> ttm_bo_bulk_move_lru_tail+0x2b; the use-after-free should be
>>>>>>> in the code below:
>>>>>>>
>>>>>>> man = &bulk->tt[i].first->bdev->man[TTM_PL_TT];
>>>>>>> ttm_bo_bulk_move_helper(&bulk->tt[i], &man->lru[i], false);
>>>>>>>
>>>>>>> Is there a case where the original bo in the bulk pos is
>>>>>>> destroyed but the pos->first pointer is not updated, so that
>>>>>>> we still use it during the bulk move?
>>>>>> Only when a per VM BO is freed or the VM destroyed.
>>>>>>
>>>>>> The first case should now be handled by "drm/amdgpu: set
>>>>>> bulk_moveable to false when a per VM is released" and when we
>>>>>> use a destroyed VM we would see other problems as well.
>>>>>>
>>>>> If a VM instance is torn down, all BOs which belong to that VM
>>>>> should be removed from the LRU. But how can we submit a cmd based
>>>>> on a destroyed VM? You know, we do the bulk move at the last step
>>>>> of submission.
>>>>
>>>> Well exactly, that's the point: this can't happen :)
>>>>
>>>> Otherwise we would crash because of using freed up memory much
>>>> earlier in the command submission.
>>>>
>>>> The best idea I have to track this down further is to add some
>>>> trace_printk in ttm_bo_bulk_move_helper and amdgpu_bo_destroy and
>>>> see why and when we are actually using a destroyed BO.
>>>>
>>>> Christian.
>>>>
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Ray
>>>>>
>>>>>> BTW: Just pushed this commit to the repository, should show up
>>>>>> any second.
>>>>>>
>>>>>> Christian.
>>>>>>
>>>>>>> Thanks,
>>>>>>> Ray
>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Christian.
>>>>>>>>
>>>>>>>> On 10.09.2018 at 10:57, Huang Rui wrote:
>>>>>>>>> It avoids being referenced again after being freed.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Huang Rui <ray.huang at amd.com>
>>>>>>>>> Cc: Christian König <christian.koenig at amd.com>
>>>>>>>>> Cc: Tom StDenis <Tom.StDenis at amd.com>
>>>>>>>>> ---
>>>>>>>>>   drivers/gpu/drm/ttm/ttm_bo.c | 1 +
>>>>>>>>>   1 file changed, 1 insertion(+)
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>>>>> index 138c989..d3ef5f8 100644
>>>>>>>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>>>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>>>>> @@ -54,6 +54,7 @@ static struct attribute ttm_bo_count = {
>>>>>>>>>   static void ttm_bo_default_destroy(struct ttm_buffer_object *bo)
>>>>>>>>>   {
>>>>>>>>>       kfree(bo);
>>>>>>>>> +   bo = NULL;
>>>>>>>>>   }
>>>>>>>>>   static inline int ttm_mem_type_from_place(const struct ttm_place *place,
>>>>>>>> _______________________________________________
>>>>>>>> amd-gfx mailing list
>>>>>>>> amd-gfx at lists.freedesktop.org
>>>>>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>>>> _______________________________________________
>>>>> dri-devel mailing list
>>>>> dri-devel at lists.freedesktop.org
>>>>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>>>>
>>>
>>
>
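
For readers following the thread, here is a small userspace sketch of the two points discussed: why setting the parameter to NULL inside the destroy function cannot help ("the pointer is only local to the function"), and how a cached position such as pos->first could keep referring to a destroyed object. Everything in it is hypothetical: demo_bo, demo_pos, demo_destroy, demo_pos_is_stale and demo_scenario are illustrative stand-ins, not TTM or amdgpu code, and a "destroyed" flag replaces kfree() so the demo stays well defined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a buffer object; NOT struct ttm_buffer_object. */
struct demo_bo {
	int id;
	bool destroyed;	/* set instead of freeing, so reads stay valid */
};

/* Hypothetical stand-in for a bulk-move position caching BO pointers. */
struct demo_pos {
	struct demo_bo *first;
	struct demo_bo *last;
};

/* Same shape as the proposed patch: release, then NULL the parameter. */
static void demo_destroy(struct demo_bo *bo)
{
	bo->destroyed = true;
	bo = NULL;	/* only changes this function's copy of the pointer */
	(void)bo;	/* silence "set but not used" warnings */
}

/* Returns true if the cached range still references a destroyed object. */
static bool demo_pos_is_stale(const struct demo_pos *pos)
{
	return (pos->first && pos->first->destroyed) ||
	       (pos->last && pos->last->destroyed);
}

/* Runs the scenario; returns true if the stale reference was observed. */
static bool demo_scenario(void)
{
	struct demo_bo a = { .id = 1, .destroyed = false };
	struct demo_bo b = { .id = 2, .destroyed = false };
	struct demo_pos pos = { .first = &a, .last = &b };

	assert(!demo_pos_is_stale(&pos));	/* nothing destroyed yet */

	demo_destroy(pos.first);

	/* The caller's pointer is untouched, so the range is now stale. */
	printf("pos.first=%p (id %d) stale=%d\n",
	       (void *)pos.first, pos.first->id, demo_pos_is_stale(&pos));

	return pos.first == &a && demo_pos_is_stale(&pos);
}
```

Calling demo_scenario() from a main() prints the unchanged pos.first pointer together with the stale flag. Logging the pointer at destroy time and again at use time is the same idea as the printks suggested above, just outside the kernel.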