[AMD Official Use Only]

> -----Original Message-----
> From: Kuehling, Felix <Felix.Kuehling@xxxxxxx>
> Sent: Tuesday, March 15, 2022 2:25 AM
> To: Zhou1, Tao <Tao.Zhou1@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx; Zhang,
> Hawking <Hawking.Zhang@xxxxxxx>; Yang, Stanley
> <Stanley.Yang@xxxxxxx>; Chai, Thomas <YiPeng.Chai@xxxxxxx>
> Subject: Re: [PATCH 1/3] drm/amdkfd: update parameter for
> event_interrupt_poison_consumption
>
> On 2022-03-14 at 03:03, Tao Zhou wrote:
> > Other parameters can be gotten from ih_ring_entry, so only inputting
> > ih_ring_entry is enough.
>
> I'm not sure what's the reason for this change. You remove one parameter, but
> end up duplicating the SOC15_..._FROM_IH_RING_ENTRY translations. It
> doesn't look like a net improvement to me.

[Tao] source_id/pasid/client_id will be transferred and I'd like to reduce the number of parameters; I'll drop the change.

>
> Looking at this function a bit more, this code looks problematic:
>
>	if (atomic_read(&p->poison)) {
>		kfd_unref_process(p);
>		return;
>	}
>
>	atomic_set(&p->poison, 1);
>	kfd_unref_process(p);
>
> Doing the read and set as two separate operations is not atomic. You should use
> atomic_cmpxchg here to make sure the poison-consumption is handled only
> once:
>
>	old_poison = atomic_cmpxchg(&p->poison, 0, 1);
>	kfd_unref_process(p);
>	if (old_poison)
>		return;
>	/* handle poison consumption */
>
> Alternatively you could use atomic_inc_return and do the poison handling only if
> that returns exactly 1.

[Tao] thanks, accepted.
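
For reference, a minimal userspace sketch of the two claim-once patterns suggested above. This is not the kernel code: struct fake_process, claim_poison_cmpxchg() and claim_poison_inc() are hypothetical names made up for illustration, and C11 <stdatomic.h> is assumed as a stand-in for the kernel's atomic_cmpxchg() and atomic_inc_return().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the poison flag kept in struct kfd_process. */
struct fake_process {
	atomic_int poison;
};

/*
 * Userspace analogue of:
 *	old_poison = atomic_cmpxchg(&p->poison, 0, 1);
 * Returns true only for the single caller that flips poison from 0 to 1.
 * Every later (or concurrent losing) caller sees a non-zero value and
 * gets false, so the poison handling runs at most once.
 */
static inline bool claim_poison_cmpxchg(struct fake_process *p)
{
	int expected = 0;

	return atomic_compare_exchange_strong(&p->poison, &expected, 1);
}

/*
 * Userspace analogue of:
 *	atomic_inc_return(&p->poison) == 1
 * atomic_fetch_add() returns the value before the increment, so adding 1
 * gives the post-increment value that atomic_inc_return() would return.
 */
static inline bool claim_poison_inc(struct fake_process *p)
{
	return atomic_fetch_add(&p->poison, 1) + 1 == 1;
}
```

In both variants only the first caller gets true and would go on to handle the poison consumption. One difference: the compare-exchange variant leaves the flag at 1, while the increment variant keeps counting on every later call.
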
>
> Regards,
>    Felix
>
>
> >
> > Signed-off-by: Tao Zhou <tao.zhou1@xxxxxxx>
> > ---
> >   drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c | 13 +++++++++----
> >   1 file changed, 9 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c b/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c
> > index 7eedbcd14828..f7def0bf0730 100644
> > --- a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c
> > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c
> > @@ -91,11 +91,16 @@ enum SQ_INTERRUPT_ERROR_TYPE {
> >   #define KFD_SQ_INT_DATA__ERR_TYPE__SHIFT 20
> >
> >   static void event_interrupt_poison_consumption(struct kfd_dev *dev,
> > -				uint16_t pasid, uint16_t source_id)
> > +				const uint32_t *ih_ring_entry)
> >   {
> > +	uint16_t source_id, pasid;
> >   	int ret = -EINVAL;
> > -	struct kfd_process *p = kfd_lookup_process_by_pasid(pasid);
> > +	struct kfd_process *p;
> >
> > +	source_id = SOC15_SOURCE_ID_FROM_IH_ENTRY(ih_ring_entry);
> > +	pasid = SOC15_PASID_FROM_IH_ENTRY(ih_ring_entry);
> > +
> > +	p = kfd_lookup_process_by_pasid(pasid);
> >   	if (!p)
> >   		return;
> >
> > @@ -270,7 +275,7 @@ static void event_interrupt_wq_v9(struct kfd_dev *dev,
> >   					sq_intr_err);
> >   				if (sq_intr_err != SQ_INTERRUPT_ERROR_TYPE_ILLEGAL_INST &&
> >   					sq_intr_err != SQ_INTERRUPT_ERROR_TYPE_MEMVIOL) {
> > -					event_interrupt_poison_consumption(dev, pasid, source_id);
> > +					event_interrupt_poison_consumption(dev, ih_ring_entry);
> >   					return;
> >   				}
> >   				break;
> > @@ -291,7 +296,7 @@ static void event_interrupt_wq_v9(struct kfd_dev *dev,
> >   		if (source_id == SOC15_INTSRC_SDMA_TRAP) {
> >   			kfd_signal_event_interrupt(pasid, context_id0 & 0xfffffff, 28);
> >   		} else if (source_id == SOC15_INTSRC_SDMA_ECC) {
> > -			event_interrupt_poison_consumption(dev, pasid, source_id);
> > +			event_interrupt_poison_consumption(dev, ih_ring_entry);
> >   			return;
> >   		}
> >   	} else if (client_id == SOC15_IH_CLIENTID_VMC ||