Re: [PATCH 1/2] drm/amdgpu: Add SDMA queue start/stop callbacks to amdgpu_ring_funcs

On 11.03.25 at 09:32, Jesse.zhang@xxxxxxx wrote:
> From: "Jesse.zhang@xxxxxxx" <Jesse.zhang@xxxxxxx>
>
> This patch introduces two new callbacks, `stop_queue` and `start_queue`, to the
> `amdgpu_ring_funcs` structure. These callbacks are designed to handle the stopping
> and starting of SDMA queues during engine reset operations. The changes include:
>
> 1. **Addition of Callbacks**:
>    - Added `stop_queue` and `start_queue` function pointers to `amdgpu_ring_funcs`.
>    - These callbacks allow for modular and flexible management of SDMA queues during
>      reset operations.

Why do these need to be per-ring callbacks?

Flexibility is usually a bad thing when it is not needed.

Regards,
Christian.

>
> 2. **Integration with SDMA v4.4.2**:
>    - Implemented `sdma_v4_4_2_stop_queue` and `sdma_v4_4_2_restore_queue` as the
>      respective callback functions for SDMA v4.4.2.
>    - These functions handle the stopping and starting of SDMA queues, ensuring that
>      the scheduler's work queue is properly managed during resets.
>
> 3. **Purpose**:
>    - The new callbacks provide a standardized way to stop and start SDMA queues,
>      which is essential for handling engine resets gracefully.
>    - This change simplifies the reset logic and improves maintainability by
>      centralizing queue management in the `amdgpu_ring_funcs` structure.
>
> 4. **Impact**:
>    - The addition of these callbacks ensures that SDMA queues are properly stopped
>      and started during reset operations, reducing the risk of race conditions and
>      improving the reliability of the reset process.
>    - This change is a prerequisite for future improvements to the SDMA reset logic,
>      including better coordination between the KGD and KFD during resets.
>
> Suggested-by: Jonathan Kim <jonathan.kim@xxxxxxx>
> Signed-off-by: Jesse Zhang <Jesse.Zhang@xxxxxxx>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 2 ++
>  drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c | 2 ++
>  2 files changed, 4 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> index b4fd1e17205e..1c52ff92ea26 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> @@ -237,6 +237,8 @@ struct amdgpu_ring_funcs {
>  	void (*patch_ce)(struct amdgpu_ring *ring, unsigned offset);
>  	void (*patch_de)(struct amdgpu_ring *ring, unsigned offset);
>  	int (*reset)(struct amdgpu_ring *ring, unsigned int vmid);
> +	int (*stop_queue)(struct amdgpu_device *adev, uint32_t instance_id);
> +	int (*start_queue)(struct amdgpu_device *adev, uint32_t instance_id);
>  	void (*emit_cleaner_shader)(struct amdgpu_ring *ring);
>  	bool (*is_guilty)(struct amdgpu_ring *ring);
>  };
> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
> index fd34dc138081..c1f7ccff9c4e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
> @@ -2132,6 +2132,8 @@ static const struct amdgpu_ring_funcs sdma_v4_4_2_ring_funcs = {
>  	.emit_reg_wait = sdma_v4_4_2_ring_emit_reg_wait,
>  	.emit_reg_write_reg_wait = amdgpu_ring_emit_reg_write_reg_wait_helper,
>  	.reset = sdma_v4_4_2_reset_queue,
> +	.stop_queue = sdma_v4_4_2_stop_queue,
> +	.start_queue = sdma_v4_4_2_restore_queue,
>  	.is_guilty = sdma_v4_4_2_ring_is_guilty,
>  };
>  
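For illustration only, here is a minimal userspace sketch of how a reset path might call the two proposed callbacks. It is not the driver code: the structs are stripped-down stand-ins for the kernel types, and `example_reset_instance`, `mock_stop_queue`, `mock_start_queue`, and the `queue_running`/`stop_calls`/`start_calls` fields are hypothetical names invented for this example. Only the `stop_queue`/`start_queue` signatures are taken from the patch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel types; NOT the real definitions. */
struct amdgpu_device {
	int queue_running[2];	/* hypothetical per-instance state */
	int stop_calls;		/* hypothetical counters for the example */
	int start_calls;
};

struct amdgpu_ring_funcs {
	/* Signatures as proposed in the patch. */
	int (*stop_queue)(struct amdgpu_device *adev, uint32_t instance_id);
	int (*start_queue)(struct amdgpu_device *adev, uint32_t instance_id);
};

struct amdgpu_ring {
	struct amdgpu_device *adev;
	const struct amdgpu_ring_funcs *funcs;
	uint32_t me;		/* engine instance this ring belongs to */
};

/* Mock callbacks standing in for the SDMA v4.4.2 implementations. */
static int mock_stop_queue(struct amdgpu_device *adev, uint32_t instance_id)
{
	adev->queue_running[instance_id] = 0;
	adev->stop_calls++;
	return 0;
}

static int mock_start_queue(struct amdgpu_device *adev, uint32_t instance_id)
{
	adev->queue_running[instance_id] = 1;
	adev->start_calls++;
	return 0;
}

static const struct amdgpu_ring_funcs mock_funcs = {
	.stop_queue = mock_stop_queue,
	.start_queue = mock_start_queue,
};

/*
 * Hypothetical reset helper: stop the queue, reset the engine (elided),
 * then start the queue again. Both callbacks are optional, so they are
 * checked before use.
 */
static int example_reset_instance(struct amdgpu_ring *ring)
{
	int r;

	if (!ring->funcs->stop_queue || !ring->funcs->start_queue)
		return -EOPNOTSUPP;

	r = ring->funcs->stop_queue(ring->adev, ring->me);
	if (r)
		return r;

	/* The actual engine reset would happen here. */

	return ring->funcs->start_queue(ring->adev, ring->me);
}
```

Note that both callbacks take a device and an instance ID rather than a ring, so the ring is only used to look them up. That is what prompts the question above about whether they belong in the per-ring function table at all.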



