> -----Original Message----- > From: Paul Menzel <pmenzel@xxxxxxxxxxxxx> > Sent: Tuesday, April 19, 2022 2:54 AM > To: Yat Sin, David <David.YatSin@xxxxxxx> > Cc: Kuehling, Felix <Felix.Kuehling@xxxxxxx>; amd- > gfx@xxxxxxxxxxxxxxxxxxxxx > Subject: Re: [PATCH v2 2/2] drm/amdkfd: CRIU add support for GWS queues > > > Dear David, > > > Thank you for sending out v3 of these patches. > > Am 19.04.22 um 02:04 schrieb Yat Sin, David: > > > > > >> -----Original Message----- > >> From: Paul Menzel <pmenzel@xxxxxxxxxxxxx> > >> Sent: Monday, April 18, 2022 4:23 PM > > […] > >> Am 18.04.22 um 18:44 schrieb David Yat Sin: > >> > >> In the commit message summary, you could reorder some words: > >> > >> Add CRIU support for GWS queues > >> > >>> Adding support to checkpoint/restore GWS(Global Wave Sync) queues. > >> > >> s/Adding/Add/ > > Did you miss the two comments above? ACK > > >> Please add a space before the (. > > ACK > >> > >> How can this be tested? > > We have some internal tests that can we be used to specifically test this > feature. > > Nice. Are you going to publish these in the future? I think some of these tests depend on other frameworks, so it might not be straight forward to do this. > > >>> Signed-off-by: David Yat Sin <david.yatsin@xxxxxxx> > >>> --- > >>> drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 2 +- > >>> drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 10 > +++++++--- > >>> 2 files changed, 8 insertions(+), 4 deletions(-) > >>> > >>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h > >>> b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h > >>> index f36062be9ca8..192dbef04c43 100644 > >>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h > >>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h > >>> @@ -1102,7 +1102,7 @@ struct kfd_criu_queue_priv_data { > >>> uint32_t priority; > >>> uint32_t q_percent; > >>> uint32_t doorbell_id; > >>> - uint32_t is_gws; > >>> + uint32_t gws; > >> > >> Why is the new name better? > > The old variable (is_gws) was obtained from the queue_properties > > structure during checkpoint and is only used temporarily during queue > > creation, so this variable cannot be used to determine whether a queue > > as gws enabled. The new variable (gws) is obtained from the queue > > structure. The name is changed to better reflect this. > > Further down you seem to use it like a boolean though. So a name reflecting > that would be nice. To me this is ok. I would rather have the variable name match its source. > > >>> uint32_t sdma_id; > >>> uint32_t eop_ring_buffer_size; > >>> uint32_t ctx_save_restore_area_size; diff --git > >>> a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c > >>> b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c > >>> index 6eca9509f2e3..4f58e671d39b 100644 > >>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c > >>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c > >>> @@ -636,6 +636,8 @@ static int criu_checkpoint_queue(struct > >> kfd_process_device *pdd, > >>> q_data->ctx_save_restore_area_size = > >>> q->properties.ctx_save_restore_area_size; > >>> > >>> + q_data->gws = !!q->gws; > >>> + > >>> ret = pqm_checkpoint_mqd(&pdd->process->pqm, q-> > properties.queue_id, mqd, ctl_stack); > >>> if (ret) { > >>> pr_err("Failed checkpoint queue_mqd (%d)\n", ret); @@ - > 743,7 > >>> +745,6 @@ static void set_queue_properties_from_criu(struct > >>> +queue_properties *qp, > >>> struct kfd_criu_queue_priv_data > *q_data) > >>> { > >>> qp->is_interop = false; > >>> - qp->is_gws = q_data->is_gws; > >>> qp->queue_percent = q_data->q_percent; > >>> qp->priority = q_data->priority; > >>> qp->queue_address = q_data->q_address; @@ -826,12 +827,15 @@ > >> int kfd_criu_restore_queue(struct kfd_process *p, > >>> NULL); > >>> if (ret) { > >>> pr_err("Failed to create new queue err:%d\n", ret); > >>> - ret = -EINVAL; > >>> + goto exit; > >>> } > >>> > >>> + if (q_data->gws) > >>> + ret = pqm_set_gws(&p->pqm, q_data->q_id, pdd->dev->gws); > >>> + > >>> exit: > >>> if (ret) > >>> - pr_err("Failed to create queue (%d)\n", ret); > >>> + pr_err("Failed to restore queue (%d)\n", ret); > >> > >> Maybe separate this out, so it can be applied to stable series. > > Did you miss this comment? What do you mean by stable series? > > >>> else > >>> pr_debug("Queue id %d was restored successfully\n", > queue_id); > >>> > > > Kind regards, > > Paul