On 2022-07-07 at 09:39, philip yang wrote:
On 2022-07-07 06:28, xinhui pan wrote:
Queue would be freed when create_queue_cpsch fails.
So let's do queue cleanup, otherwise various list and memory issues
happen.
This bug was introduced when adding MES support, as we used to ignore
execute_queues_cpsch return value. Cleanup and return error to user
space looks good to me.
This is similar to the queue destroy failure you looked at. A failure in
execute_queues_cpsch doesn't really indicate that the queue creation
failed. There is nothing specifically wrong with this queue. It just
means that HWS is probably hanging. So this problem will be handled with
a GPU reset anyway.
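
To make the control flow under discussion concrete, here is a minimal user-space sketch of the patched error path. It is only a toy model: the toy_* types, the hws_hung flag and the -EIO error code are hypothetical stand-ins, and the cleanup shown is not the body of the real cleanup_queue label. It illustrates that, with the patch, a failure on either the HWS path or the MES path undoes the bookkeeping done earlier in queue creation.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-ins for the KFD structures. */
struct toy_queue {
	bool is_active;
	bool on_list;           /* models membership in the process queue list */
};

struct toy_dqm {
	bool enable_mes;
	bool hws_hung;          /* assumption: simulates a hanging scheduler */
	int active_queue_count;
};

/* Stand-in for execute_queues_cpsch(): fails only when the scheduler hangs. */
static int toy_execute_queues(struct toy_dqm *dqm)
{
	return dqm->hws_hung ? -EIO : 0;
}

/* Stand-in for add_queue_mes(). */
static int toy_add_queue_mes(struct toy_dqm *dqm, struct toy_queue *q)
{
	(void)q;
	return dqm->hws_hung ? -EIO : 0;
}

/* Models the patched flow: both branches share one error check. */
int toy_create_queue(struct toy_dqm *dqm, struct toy_queue *q)
{
	int retval = 0;

	q->on_list = true;
	if (q->is_active) {
		dqm->active_queue_count++;
		if (!dqm->enable_mes)
			retval = toy_execute_queues(dqm);
		else
			retval = toy_add_queue_mes(dqm, q);
		if (retval)
			goto cleanup_queue;
	}
	return 0;

cleanup_queue:
	/* Undo the bookkeeping so the caller can free the queue safely. */
	assert(dqm->active_queue_count > 0);
	dqm->active_queue_count--;
	q->on_list = false;
	return retval;
}
```

In this model, calling toy_create_queue() with hws_hung set returns the error and leaves active_queue_count and on_list as they were before the call. Whether a scheduler hang should be reported as a creation failure at all is the open question raised above; the sketch only shows what the patch does, not which policy is right.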
Regards,
Felix
Reviewed-by: Philip Yang <Philip.Yang@xxxxxxx>
Signed-off-by: xinhui pan <xinhui.pan@xxxxxxx>
---
drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 11 +++++------
1 file changed, 5 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 93a0b6995430..e83725a28106 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -1674,14 +1674,13 @@ static int create_queue_cpsch(struct device_queue_manager *dqm, struct queue *q,
 	if (q->properties.is_active) {
 		increment_queue_count(dqm, qpd, q);

-		if (!dqm->dev->shared_resources.enable_mes) {
+		if (!dqm->dev->shared_resources.enable_mes)
 			retval = execute_queues_cpsch(dqm,
-					KFD_UNMAP_QUEUES_FILTER_DYNAMIC_QUEUES, 0);
-		} else {
+						      KFD_UNMAP_QUEUES_FILTER_DYNAMIC_QUEUES, 0);
+		else
 			retval = add_queue_mes(dqm, q, qpd);
-			if (retval)
-				goto cleanup_queue;
-		}
+		if (retval)
+			goto cleanup_queue;
 	}

 	/*