[AMD Official Use Only - AMD Internal Distribution Only]

Hi Christian,

-----Original Message-----
From: Christian König <ckoenig.leichtzumerken@xxxxxxxxx>
Sent: Monday, May 13, 2024 7:41 PM
To: Zhang, Jesse(Jie) <Jesse.Zhang@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
Cc: Deucher, Alexander <Alexander.Deucher@xxxxxxx>; Koenig, Christian <Christian.Koenig@xxxxxxx>; Huang, Tim <Tim.Huang@xxxxxxx>
Subject: Re: [PATCH 01/22] drm/amdgpu: fix dereference after null check

On 10.05.24 at 04:50, Jesse Zhang wrote:
> check the pointer hive before use.
>
> Signed-off-by: Jesse Zhang <Jesse.Zhang@xxxxxxx>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 77f6fd50002a..00fe3c2d5431 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -5725,7 +5725,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
>  	 * to put adev in the 1st position.
>  	 */
>  	INIT_LIST_HEAD(&device_list);
> -	if (!amdgpu_sriov_vf(adev) && (adev->gmc.xgmi.num_physical_nodes > 1)) {
> +	if (!amdgpu_sriov_vf(adev) && (adev->gmc.xgmi.num_physical_nodes > 1) && hive) {

That solution doesn't look optimal to me.

Checking adev->gmc.xgmi.num_physical_nodes > 1 already makes sure that hive shouldn't be NULL.

If automated checkers complain about that, we should probably drop the adev->gmc.xgmi.num_physical_nodes > 1 check and check for hive instead.

[Zhang, Jesse(Jie)] gmc.xgmi.num_physical_nodes is obtained by reading the register GCMC_VM_XGMI_LFB_CNTL. But getting the hive may still fail in the function amdgpu_get_xgmi_hive, either because of no memory (NOMEM) or because the kobject of the xgmi hive cannot be initialized. So is checking (adev->gmc.xgmi.num_physical_nodes > 1) equivalent to checking that hive is non-NULL here?

Regards
Jesse

Regards,
Christian.
>  		list_for_each_entry(tmp_adev, &hive->device_list, gmc.xgmi.head) {
>  			list_add_tail(&tmp_adev->reset_list, &device_list);
>  			if (adev->shutdown)
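
To make the point under discussion concrete, here is a minimal, self-contained sketch of guarding on the pointer that is actually dereferenced rather than on the node count. It is not the amdgpu code: `struct hive_model`, `struct device_model` and `count_reset_devices()` are hypothetical stand-ins invented for illustration, and the sketch simply assumes that a hive lookup can yield NULL even when more than one physical node is reported.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver structures; not the real amdgpu types. */
struct hive_model {
	int num_devices;             /* devices linked into the hive */
};

struct device_model {
	bool is_sriov_vf;            /* models amdgpu_sriov_vf(adev) */
	int num_physical_nodes;      /* models adev->gmc.xgmi.num_physical_nodes */
	struct hive_model *hive;     /* lookup result; NULL if the lookup failed */
};

/*
 * Returns how many devices would be queued for reset.
 *
 * The guard tests the pointer that is about to be dereferenced. The node
 * count is deliberately not consulted: in this model it comes from a
 * separate source and says nothing about whether the lookup succeeded.
 */
int count_reset_devices(const struct device_model *dev)
{
	assert(dev != NULL);

	if (!dev->is_sriov_vf && dev->hive != NULL)
		return dev->hive->num_devices;   /* multi-device (hive) path */

	return 1;                                /* single-device fallback */
}
```

With a device that reports four physical nodes but whose `hive` is NULL, `count_reset_devices()` takes the single-device fallback instead of dereferencing a NULL pointer. A check on `num_physical_nodes > 1` alone would not give that guarantee in this model.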