On 11/30/2018 03:08 PM, Alex Deucher wrote: > On Fri, Nov 30, 2018 at 3:06 PM Grodzovsky, Andrey > <Andrey.Grodzovsky@xxxxxxx> wrote: >> >> >> On 11/30/2018 02:49 PM, Alex Deucher wrote: >>> On Fri, Nov 30, 2018 at 1:17 PM Andrey Grodzovsky >>> <andrey.grodzovsky@xxxxxxx> wrote: >>>> XGMI hive has some resources allocted on device init which >>>> needs to be deallocated when the device is unregistered. >>>> >>>> v2: Remove creation of dedicated wq for XGMI hive reset. >>>> >>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@xxxxxxx> >>>> --- >>>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 +++ >>>> drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 20 ++++++++++++++++++++ >>>> drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h | 1 + >>>> 3 files changed, 24 insertions(+) >>>> >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>>> index c75badf..bfd286c 100644 >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>>> @@ -1864,6 +1864,9 @@ static int amdgpu_device_ip_fini(struct amdgpu_device *adev) >>>> { >>>> int i, r; >>>> >>>> + if (adev->gmc.xgmi.num_physical_nodes > 1) >>>> + amdgpu_xgmi_remove_device(adev); >>>> + >>>> amdgpu_amdkfd_device_fini(adev); >>>> >>>> amdgpu_device_set_pg_state(adev, AMD_PG_STATE_UNGATE); >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c >>>> index fb37e69..38e1599 100644 >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c >>>> @@ -135,3 +135,23 @@ int amdgpu_xgmi_add_device(struct amdgpu_device *adev) >>>> mutex_unlock(&xgmi_mutex); >>>> return ret; >>>> } >>>> + >>>> +void amdgpu_xgmi_remove_device(struct amdgpu_device *adev) >>>> +{ >>>> + struct amdgpu_hive_info *hive; >>>> + >>>> + if ((adev->asic_type < CHIP_VEGA20) || (adev->flags & AMD_IS_APU)) >>>> + return; >>> It would be nice to have something better here to check against. This >>> seems kind of fragile. Can we check based on some xgmi related >>> structure? >>> >>> Alex >> This check is same as in amdgpu_xgmi_add_device, i guess we could >> look if adev->gmc.xgmi.head is empty or not since it's only gets added to >> hive->device_list if amdgpu_xgmi_add_device was called. >> Sounds ok ? > Sounds good. We should probably fix up the other case as well at some point. > > Thanks! > > Alex The other case is in amdgpu_xgmi_add_device - at the beginning - just avoids XGMI code path for non VEGA20 dGPUs - how can we avoid this check ? Andrey > >> Andrey >> >>>> + >>>> + mutex_lock(&xgmi_mutex); >>>> + >>>> + hive = amdgpu_get_xgmi_hive(adev); >>>> + if (!hive) >>>> + goto exit; >>>> + >>>> + if (!(hive->number_devices--)) >>>> + mutex_destroy(&hive->hive_lock); >>>> + >>>> +exit: >>>> + mutex_unlock(&xgmi_mutex); >>>> +} >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h >>>> index 6335bfd..6151eb9 100644 >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h >>>> @@ -35,5 +35,6 @@ struct amdgpu_hive_info { >>>> struct amdgpu_hive_info *amdgpu_get_xgmi_hive(struct amdgpu_device *adev); >>>> int amdgpu_xgmi_update_topology(struct amdgpu_hive_info *hive, struct amdgpu_device *adev); >>>> int amdgpu_xgmi_add_device(struct amdgpu_device *adev); >>>> +void amdgpu_xgmi_remove_device(struct amdgpu_device *adev); >>>> >>>> #endif >>>> -- >>>> 2.7.4 >>>> >>>> _______________________________________________ >>>> amd-gfx mailing list >>>> amd-gfx@xxxxxxxxxxxxxxxxxxxxx >>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx _______________________________________________ amd-gfx mailing list amd-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/amd-gfx