On 11/26/18 1:57 AM, Zhou, David(ChunMing) wrote: > > >> -----Original Message----- >> From: Christian König <ckoenig.leichtzumerken@xxxxxxxxx> >> Sent: Monday, November 26, 2018 5:23 PM >> To: Sharma, Deepak <Deepak.Sharma@xxxxxxx>; Zhou, David(ChunMing) >> <David1.Zhou@xxxxxxx>; Koenig, Christian <Christian.Koenig@xxxxxxx>; >> amd-gfx@xxxxxxxxxxxxxxxxxxxxx >> Subject: Re: [PATCH] drm/amdgpu: add the checking to avoid NULL pointer >> dereference >> >> Am 26.11.18 um 02:59 schrieb Sharma, Deepak: >>> >>> 在 2018/11/24 2:10, Koenig, Christian 写道: >>>> Am 23.11.18 um 15:10 schrieb Zhou, David(ChunMing): >>>>> 在 2018/11/23 21:30, Koenig, Christian 写道: >>>>>> Am 23.11.18 um 14:27 schrieb Zhou, David(ChunMing): >>>>>>> 在 2018/11/22 19:25, Christian König 写道: >>>>>>>> Am 22.11.18 um 07:56 schrieb Sharma, Deepak: >>>>>>>>> when returned fence is not valid mostly due to userspace ignored >>>>>>>>> previous error causes NULL pointer dereference. >>>>>>>> Again, this is clearly incorrect. The my other mails on the >>>>>>>> earlier patch. >>>>>>> Sorry for I didn't get your history, but looks from the patch >>>>>>> itself, it is still a valid patch, isn't it? >>>>>> No, the semantic of amdgpu_ctx_get_fence() is that we return NULL >>>>>> when the fence is already signaled. >>>>>> >>>>>> So this patch could totally break userspace because it changes the >>>>>> behavior when we try to sync to an already signaled fence. >>>>> Ah, I got your meaning, how about attached patch? >>>> Yeah something like this, but I would just give the >>>> DRM_SYNCOBJ_CREATE_SIGNALED instead. >>>> >>>> I mean that's what this flag is good for isn't it? >>> Yeah, I give a flag initally when creating patch, but as you know, there is a >> swich case not be able to use that flag: >>> >>> case AMDGPU_FENCE_TO_HANDLE_GET_SYNC_FILE_FD: >>> fd = get_unused_fd_flags(O_CLOEXEC); >>> if (fd < 0) { >>> dma_fence_put(fence); >>> return fd; >>> } >>> >>> sync_file = sync_file_create(fence); >>> dma_fence_put(fence); >>> if (!sync_file) { >>> put_unused_fd(fd); >>> return -ENOMEM; >>> } >>> >>> fd_install(fd, sync_file->file); >>> info->out.handle = fd; >>> return 0; >>> >>> So I change to stub fence instead. >> >> Yeah, I've missed that case. Not sure if the sync_file can deal with a NULL >> fence. >> >> We should then probably move the stub fence function into >> dma_fence_stub.c under drivers/dma-buf to keep the stuff together. > > Yes, you wrap it to review first with your stub fence, we can do it separately first. > > -David >> >>> >>> -David >>> >>> I have not applied this patch. >>> The issue was trying to address is when amdgpu_cs_ioctl() failed due to >> low memory (ENOMEM) but userspace chose to proceed and called >> amdgpu_cs_fence_to_handle_ioctl(). >>> In amdgpu_cs_fence_to_handle_ioctl(), fence is null and later causing >>> NULL pointer dereference, this patch was to avoid that and system panic >> But I understand now that its a valid case retuning NULL if fence was already >> signaled but need to handle case so avoid kernel panic. Seems David patch >> should fix this, I will test it tomorrow. >> >> Mhm, but don't we bail out with an error if we ask for a failed command >> submission? If not that sounds like a bug as well. >> >> Christian. >> Where do we do that? I see error [drm:amdgpu_cs_ioctl] *ERROR* amdgpu_cs_list_validate(validated) failed. [drm:amdgpu_cs_ioctl] *ERROR* Not enough memory for command submission! BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 Did some more debugging,dma_fence_is_array() is causing NULL pointer dereference call through sync_file_ioctl. Also I think changes in David patch cant be applied on amd-staging-drm-next, which all patches I should take to get it correctly? >>> >>> -Deepak >>>> Christian. >>>> >>>>> -David >>>>>> If that patch was applied then please revert it immediately. >>>>>> >>>>>> Christian. >>>>>> >>>>>>> -David >>>>>>>> If you have already pushed the patch then please revert. >>>>>>>> >>>>>>>> Christian. >>>>>>>> >>>>>>>>> Signed-off-by: Deepak Sharma <Deepak.Sharma@xxxxxxx> >>>>>>>>> --- >>>>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++ >>>>>>>>> 1 file changed, 2 insertions(+) >>>>>>>>> >>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >>>>>>>>> index 024dfbd87f11..14166cd8a12f 100644 >>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >>>>>>>>> @@ -1403,6 +1403,8 @@ static struct dma_fence >>>>>>>>> *amdgpu_cs_get_fence(struct amdgpu_device *adev, >>>>>>>>> fence = amdgpu_ctx_get_fence(ctx, entity, user->seq_no); >>>>>>>>> amdgpu_ctx_put(ctx); >>>>>>>>> + if(!fence) >>>>>>>>> + return ERR_PTR(-EINVAL); >>>>>>>>> return fence; >>>>>>>>> } >>> _______________________________________________ >>> amd-gfx mailing list >>> amd-gfx@xxxxxxxxxxxxxxxxxxxxx >>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx > _______________________________________________ amd-gfx mailing list amd-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/amd-gfx