[AMD Official Use Only - AMD Internal Distribution Only] > -----Original Message----- > From: Kuehling, Felix <Felix.Kuehling@xxxxxxx> > Sent: Wednesday, December 4, 2024 7:01 PM > To: Russell, Kent <Kent.Russell@xxxxxxx>; Martin, Andrew > <Andrew.Martin@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx > Cc: Tudor, Alexandru <Alexandru.Tudor@xxxxxxx> > Subject: Re: [PATCH v2] drm/amdkfd: Dereference null return value > > > On 2024-12-03 09:30, Russell, Kent wrote: > > > > [Public] > > > > > > > > > > ---------------------------------------------------------------------- > > -- > > *From:* amd-gfx <amd-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> on behalf of > > Andrew Martin <Andrew.Martin@xxxxxxx> > > *Sent:* Monday, December 2, 2024 7:45:55 a.m. > > *To:* amd-gfx@xxxxxxxxxxxxxxxxxxxxx <amd-gfx@xxxxxxxxxxxxxxxxxxxxx> > > *Cc:* Kuehling, Felix <Felix.Kuehling@xxxxxxx>; Tudor, Alexandru > > <Alexandru.Tudor@xxxxxxx>; Martin, Andrew > <Andrew.Martin@xxxxxxx>; > > Martin, Andrew <Andrew.Martin@xxxxxxx> > > *Subject:* [PATCH v2] drm/amdkfd: Dereference null return value > > > > In the function pqm_uninit there is a call-assignment of "pdd = > > kfd_get_process_device_data" which could be null, and this value was > > later dereferenced without checking. > > > > Fixes: fb91065851cd ("drm/amdkfd: Refactor queue wptr_bo GART > > mapping") > > Signed-off-by: Andrew Martin <Andrew.Martin@xxxxxxx> > > --- > > drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 6 ++++-- > > 1 file changed, 4 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c > > b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c > > index c76db22a1000..89aa578f6c68 100644 > > --- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c > > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c > > @@ -212,11 +212,13 @@ static void pqm_clean_queue_resource(struct > > process_queue_manager *pqm, > > void pqm_uninit(struct process_queue_manager *pqm) > > { > > struct process_queue_node *pqn, *next; > > - struct kfd_process_device *pdd; > > > > list_for_each_entry_safe(pqn, next, &pqm->queues, > > process_queue_list) { > > if (pqn->q) { > > - pdd = > > kfd_get_process_device_data(pqn->q->device, pqm->process); > > + struct kfd_process_device *pdd = > > kfd_get_process_device_data(pqn->q->device, > > + pqm->process); > > + if (WARN_ON(!pdd)) > > > > Would we want a "continue" instead of "break" if the pdd is NULL? Just > > in case other ones in the list are still valid? Or is one NULL enough > > to just WARN and abort? > > I agree, we should use "continue" here. We are leaking memory, but let's not > leak more than necessary. With that fixed, the patch is > Greetings Kent/Felix, sending peace. Thanks, I will make the change, run the tests and send it out. > Reviewed-by: Felix Kuehling <felix.kuehling@xxxxxxx> > > > Thanks, > Felix > > > > > > Kent > > > > + return; > > kfd_queue_unref_bo_vas(pdd, > > &pqn->q->properties); > > kfd_queue_release_buffers(pdd, > > &pqn->q->properties); > > pqm_clean_queue_resource(pqm, pqn); > > -- > > 2.43.0 > > > >