On 1/19/21 7:22 AM, Horace Chen wrote:
> Fix a race condition when jobs on 2 rings time out simultaneously.
>
> If 2 rings time out at the same time, amdgpu_device_gpu_recover
> will be reentered. Then adev->gmc.xgmi.head will be grabbed
> by 2 local linked lists, which may cause a wild pointer issue
> while iterating.
>
> Lock the device early to prevent the node from being added to 2
> different lists.
>
> Signed-off-by: Horace Chen <horace.chen@xxxxxxx>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 42 +++++++++++++++-------
> 1 file changed, 30 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 4d434803fb49..9574da3abc32 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -4540,6 +4540,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
> int i, r = 0;
> bool need_emergency_restart = false;
> bool audio_suspended = false;
> + bool get_dev_lock = false;
>
> /*
> * Special case: RAS triggered and full reset isn't supported
> @@ -4582,28 +4583,45 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
> * Build list of devices to reset.
> * In case we are in XGMI hive mode, resort the device list
> * to put adev in the 1st position.
> + *
> + * Lock the device before we try to operate on the linked list.
> + * If we didn't get the device lock, don't touch the linked list since
> + * others may be iterating over it.
> */
> INIT_LIST_HEAD(&device_list);
> if (adev->gmc.xgmi.num_physical_nodes > 1) {
> if (!hive)
> return -ENODEV;
> - if (!list_is_first(&adev->gmc.xgmi.head, &hive->device_list))
> - list_rotate_to_front(&adev->gmc.xgmi.head, &hive->device_list);
> - device_list_handle = &hive->device_list;
> +
> + list_for_each_entry(tmp_adev, &hive->device_list, gmc.xgmi.head) {
> + get_dev_lock = amdgpu_device_lock_adev(tmp_adev, hive);
> + if (!get_dev_lock)
> + break;
What about unlocking all the devices you have already locked if the break
happens in the middle of the iteration?
Note that at skip_recovery: we don't do it. BTW, I see this issue is already
present in the current code.
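
Untested sketch of the unwind I have in mind, assuming
amdgpu_device_unlock_adev() as the counterpart that drops whatever
amdgpu_device_lock_adev() took:

	list_for_each_entry(tmp_adev, &hive->device_list, gmc.xgmi.head) {
		get_dev_lock = amdgpu_device_lock_adev(tmp_adev, hive);
		if (!get_dev_lock) {
			struct amdgpu_device *locked_adev;

			/* roll back every device locked before the failure */
			list_for_each_entry(locked_adev, &hive->device_list,
					    gmc.xgmi.head) {
				if (locked_adev == tmp_adev)
					break;
				amdgpu_device_unlock_adev(locked_adev);
			}
			break;
		}
	}
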
Also, maybe now it's better to separate the actual locking in
amdgpu_device_lock_adev from the other stuff going on there, since I don't
think you want to toggle things like adev->mp1_state back and forth, and the
function name isn't descriptive of that other work anyway.
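
Very rough illustration of the split I mean - the helper names are made up,
and the bodies only show where the pieces of today's amdgpu_device_lock_adev()
would land (assuming I read its current body right):

	/* pure locking: just claim the device, nothing else */
	static bool amdgpu_device_trylock_adev(struct amdgpu_device *adev)
	{
		if (atomic_cmpxchg(&adev->in_gpu_reset, 0, 1) != 0)
			return false;

		down_write(&adev->reset_sem);
		return true;
	}

	/*
	 * The state toggling that amdgpu_device_lock_adev() does today
	 * (mp1_state selection etc.) moves here and runs only once we
	 * know the reset will actually go ahead.
	 */
	static void amdgpu_device_prepare_reset(struct amdgpu_device *adev)
	{
		switch (amdgpu_asic_reset_method(adev)) {
		case AMD_RESET_METHOD_MODE1:
			adev->mp1_state = PP_MP1_STATE_SHUTDOWN;
			break;
		case AMD_RESET_METHOD_MODE2:
			adev->mp1_state = PP_MP1_STATE_RESET;
			break;
		default:
			adev->mp1_state = PP_MP1_STATE_NONE;
			break;
		}
	}
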
Andrey
> + }
> + if (get_dev_lock) {
> + if (!list_is_first(&adev->gmc.xgmi.head, &hive->device_list))
> + list_rotate_to_front(&adev->gmc.xgmi.head, &hive->device_list);
> + device_list_handle = &hive->device_list;
> + }
> } else {
> - list_add_tail(&adev->gmc.xgmi.head, &device_list);
> - device_list_handle = &device_list;
> + get_dev_lock = amdgpu_device_lock_adev(adev, hive);
> + tmp_adev = adev;
> + if (get_dev_lock) {
> + list_add_tail(&adev->gmc.xgmi.head, &device_list);
> + device_list_handle = &device_list;
> + }
> + }
> +
> + if (!get_dev_lock) {
> + dev_info(tmp_adev->dev, "Bailing on TDR for s_job:%llx, as another already in progress",
> + job ? job->base.id : -1);
> + r = 0;
> + /* even we skipped this reset, still need to set the job to guilty */
> + goto skip_recovery;
> }
>
> /* block all schedulers and reset given job's ring */
> list_for_each_entry(tmp_adev, device_list_handle, gmc.xgmi.head) {
> - if (!amdgpu_device_lock_adev(tmp_adev, hive)) {
> - dev_info(tmp_adev->dev, "Bailing on TDR for s_job:%llx, as another already in progress",
> - job ? job->base.id : -1);
> - r = 0;
> - goto skip_recovery;
> - }
> -
> /*
> * Try to put the audio codec into suspend state
> * before gpu reset started.