On 2018-08-29 10:57 a.m., Christian König wrote: > Am 29.08.2018 um 09:52 schrieb Michel Dänzer: >> On 2018-08-28 7:03 p.m., Michel Dänzer wrote: >>> On 2018-08-28 11:14 a.m., Michel Dänzer wrote: >>>> On 2018-08-22 9:52 a.m., Huang Rui wrote: >>>>> The new bulk moving functionality is ready, the overhead of moving >>>>> PD/PT bos to >>>>> LRU is fixed. So move them on LRU again. >>>>> >>>>> Signed-off-by: Huang Rui <ray.huang at amd.com> >>>>> Tested-by: Mike Lothian <mike at fireburn.co.uk> >>>>> Tested-by: Dieter Nützel <Dieter at nuetzel-hh.de> >>>>> Acked-by: Chunming Zhou <david1.zhou at amd.com> >>>>> Reviewed-by: Junwei Zhang <Jerry.Zhang at amd.com> >>>>> --- >>>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 +- >>>>>  1 file changed, 1 insertion(+), 1 deletion(-) >>>>> >>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c >>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c >>>>> index db1f28a..d195a3d 100644 >>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c >>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c >>>>> @@ -1107,7 +1107,7 @@ int amdgpu_vm_update_directories(struct >>>>> amdgpu_device *adev, >>>>>                         struct amdgpu_vm_bo_base, >>>>>                         vm_status); >>>>>          bo_base->moved = false; >>>>> -       list_del_init(&bo_base->vm_status); >>>>> +       list_move(&bo_base->vm_status, &vm->idle); >>>>>           bo = bo_base->bo->parent; >>>>>          if (!bo) >>>>> >>>> Since this change, I'm getting various badness when running piglit >>>> using >>>> radeonsi on Bonaire, see the attached dmesg excerpt. >>>> >>>> Reverting just this change on top of current amd-staging-drm-next >>>> avoids >>>> the problem. >>>> >>>> Looks like some list manipulation isn't sufficiently protected against >>>> concurrent execution? >>> KASAN pointed me to one issue: >>> https://patchwork.freedesktop.org/patch/246212/ >>> >>> However, this doesn't fully fix the problem. >> Ray, any ideas yet for solving this? If not, let's revert this change >> for now. > > I've gone over this multiple times now as well, but can't find anything > obvious wrong either. After looking at the code, one question: Why does vm->moved need a spinlock, but not vm->idle? What is protecting against concurrent access to the latter? -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer