[AMD Official Use Only - General] The current ROCr patches already address my previous feedback. I am ok with the current ROCr patches. Currently, there is no ROCrtst that would stress this multiple-waiters issue. I was thinking something like the KFDTest, but with by calling the waiters from different threads. @Zhu, James Would you have time to look into this? ~David > -----Original Message----- > From: Kuehling, Felix <Felix.Kuehling@xxxxxxx> > Sent: Friday, June 9, 2023 6:44 PM > To: Zhu, James <James.Zhu@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx > Cc: Yat Sin, David <David.YatSin@xxxxxxx>; Zhu, James > <James.Zhu@xxxxxxx> > Subject: Re: [PATCH v5 3/5] drm/amdkfd: set activated flag true when event > age unmatchs > > From the KFD perspective, the series is > > Reviewed-by: Felix Kuehling <Felix.Kuehling@xxxxxxx> > > David, I looked at the ROCr and Thunk changes as well, and they look > reasonable to me. Do you have any feedback on these patches from a ROCr > point of view? Is there a reasonable stress test that could be used check that > this handles the race conditions as expected and allows all waiters to sleep? > > Regards, > Felix > > > On 2023-06-09 16:43, James Zhu wrote: > > Set waiter's activated flag true when event age unmatchs with > last_event_age. > > > > -v4: add event type check > > -v5: improve on event age enable and activated flags > > > > Signed-off-by: James Zhu <James.Zhu@xxxxxxx> > > --- > > drivers/gpu/drm/amd/amdkfd/kfd_events.c | 17 +++++++++++++---- > > 1 file changed, 13 insertions(+), 4 deletions(-) > > > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_events.c > > b/drivers/gpu/drm/amd/amdkfd/kfd_events.c > > index c7689181cc22..b2586a1dd35d 100644 > > --- a/drivers/gpu/drm/amd/amdkfd/kfd_events.c > > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_events.c > > @@ -41,6 +41,7 @@ struct kfd_event_waiter { > > wait_queue_entry_t wait; > > struct kfd_event *event; /* Event to wait for */ > > bool activated; /* Becomes true when event is signaled */ > > + bool event_age_enabled; /* set to true when last_event_age is > > +non-zero */ > > }; > > > > /* > > @@ -797,9 +798,9 @@ static struct kfd_event_waiter > > *alloc_event_waiters(uint32_t num_events) > > > > static int init_event_waiter(struct kfd_process *p, > > struct kfd_event_waiter *waiter, > > - uint32_t event_id) > > + struct kfd_event_data *event_data) > > { > > - struct kfd_event *ev = lookup_event_by_id(p, event_id); > > + struct kfd_event *ev = lookup_event_by_id(p, event_data->event_id); > > > > if (!ev) > > return -EINVAL; > > @@ -808,6 +809,15 @@ static int init_event_waiter(struct kfd_process *p, > > waiter->event = ev; > > waiter->activated = ev->signaled; > > ev->signaled = ev->signaled && !ev->auto_reset; > > + > > + /* last_event_age = 0 reserved for backward compatible */ > > + if (waiter->event->type == KFD_EVENT_TYPE_SIGNAL && > > + event_data->signal_event_data.last_event_age) { > > + waiter->event_age_enabled = true; > > + if (ev->event_age != event_data- > >signal_event_data.last_event_age) > > + waiter->activated = true; > > + } > > + > > if (!waiter->activated) > > add_wait_queue(&ev->wq, &waiter->wait); > > spin_unlock(&ev->lock); > > @@ -948,8 +958,7 @@ int kfd_wait_on_events(struct kfd_process *p, > > goto out_unlock; > > } > > > > - ret = init_event_waiter(p, &event_waiters[i], > > - event_data.event_id); > > + ret = init_event_waiter(p, &event_waiters[i], &event_data); > > if (ret) > > goto out_unlock; > > }