On 2023-06-01 16:47, James Zhu wrote:
Don't sleep when event age unmatch, and update last_event_age.
It is only for KFD_EVENT_TYPE_SIGNAL which is checked by user space.
Signed-off-by: James Zhu <James.Zhu@xxxxxxx>
---
drivers/gpu/drm/amd/amdkfd/kfd_events.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
index b27a79c5f826..23e154811471 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_events.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
@@ -964,6 +964,19 @@ int kfd_wait_on_events(struct kfd_process *p,
event_data.event_id);
if (ret)
goto out_unlock;
+
+ /* last_event_age = 0 reserved for backward compatible */
+ if (event_data.last_event_age &&
+ event_waiters[i].event->event_age != event_data.last_event_age) {
+ event_data.last_event_age = event_waiters[i].event->event_age;
+ WRITE_ONCE(event_waiters[i].activated, true);
I think this is probably not necessary if you're returning before the
first call to test_event_condition.
+
+ if (copy_to_user(&events[i], &event_data,
+ sizeof(struct kfd_event_data))) {
+ ret = -EFAULT;
+ goto out_unlock;
+ }
+ }
When waiting on multiple events, it would be more efficient to catch all
events with outdated age in a single system call, instead of returning
after finding the first one. Then return after the loop if it found one
or more outdated events.
Also EFAULT is the wrong error code. It means "bad address". I think we
could return 0 here because there is really no error. It's just a normal
condition to allow user mode to update its event information before
going to sleep. If you want a non-0 return code, I'd recommend -EDEADLK,
because sleeping without outdated event information can lead to
deadlocks. We'd also need to translate that into a meaningful status
code in the Thunk to let ROCr know what's going on. I don't see a
suitable status code in the current Thunk API for this.
Regards,
Felix
}
/* Check condition once. */