Re: [PATCH v5 3/5] drm/amdkfd: set activated flag true when event age unmatchs

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Testing for intermittent failures or race conditions is not easy. If we create such a test, we need to make sure it can catch the problem when not using the event ages, just to know that the test is good enough.

I guess it could be a parametrized test that can run with or without event age. Without event age, we'd expect to catch a timeout. Not catching a timeout would be a test failure (indicating that the test is not good enough). With event age it should not time out, i.e. a timeout would be considered a failure in this case (indicating a problem with the event age mechanism).

That said, I'd feel better about a ROCr test that doesn't just cover the KFD event age mechanism, but also its use in the ROCr implementation of HSA signal waiting.

Regards,
  Felix


Am 2023-06-12 um 12:19 schrieb Yat Sin, David:
[AMD Official Use Only - General]

The current ROCr patches already address my previous feedback. I am ok with the current ROCr patches.

Currently, there is no ROCrtst that would stress this multiple-waiters issue. I was thinking something like the KFDTest, but with by calling the waiters from different threads. @Zhu, James Would you have time to look into this?

~David

-----Original Message-----
From: Kuehling, Felix <Felix.Kuehling@xxxxxxx>
Sent: Friday, June 9, 2023 6:44 PM
To: Zhu, James <James.Zhu@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
Cc: Yat Sin, David <David.YatSin@xxxxxxx>; Zhu, James
<James.Zhu@xxxxxxx>
Subject: Re: [PATCH v5 3/5] drm/amdkfd: set activated flag true when event
age unmatchs

  From the KFD perspective, the series is

Reviewed-by: Felix Kuehling <Felix.Kuehling@xxxxxxx>

David, I looked at the ROCr and Thunk changes as well, and they look
reasonable to me. Do you have any feedback on these patches from a ROCr
point of view? Is there a reasonable stress test that could be used check that
this handles the race conditions as expected and allows all waiters to sleep?

Regards,
    Felix


On 2023-06-09 16:43, James Zhu wrote:
Set waiter's activated flag true when event age unmatchs with
last_event_age.
-v4: add event type check
-v5: improve on event age enable and activated flags

Signed-off-by: James Zhu <James.Zhu@xxxxxxx>
---
   drivers/gpu/drm/amd/amdkfd/kfd_events.c | 17 +++++++++++++----
   1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_events.c
b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
index c7689181cc22..b2586a1dd35d 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_events.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
@@ -41,6 +41,7 @@ struct kfd_event_waiter {
     wait_queue_entry_t wait;
     struct kfd_event *event; /* Event to wait for */
     bool activated;          /* Becomes true when event is signaled */
+   bool event_age_enabled;  /* set to true when last_event_age is
+non-zero */
   };

   /*
@@ -797,9 +798,9 @@ static struct kfd_event_waiter
*alloc_event_waiters(uint32_t num_events)

   static int init_event_waiter(struct kfd_process *p,
             struct kfd_event_waiter *waiter,
-           uint32_t event_id)
+           struct kfd_event_data *event_data)
   {
-   struct kfd_event *ev = lookup_event_by_id(p, event_id);
+   struct kfd_event *ev = lookup_event_by_id(p, event_data->event_id);

     if (!ev)
             return -EINVAL;
@@ -808,6 +809,15 @@ static int init_event_waiter(struct kfd_process *p,
     waiter->event = ev;
     waiter->activated = ev->signaled;
     ev->signaled = ev->signaled && !ev->auto_reset;
+
+   /* last_event_age = 0 reserved for backward compatible */
+   if (waiter->event->type == KFD_EVENT_TYPE_SIGNAL &&
+           event_data->signal_event_data.last_event_age) {
+           waiter->event_age_enabled = true;
+           if (ev->event_age != event_data-
signal_event_data.last_event_age)
+                   waiter->activated = true;
+   }
+
     if (!waiter->activated)
             add_wait_queue(&ev->wq, &waiter->wait);
     spin_unlock(&ev->lock);
@@ -948,8 +958,7 @@ int kfd_wait_on_events(struct kfd_process *p,
                     goto out_unlock;
             }

-           ret = init_event_waiter(p, &event_waiters[i],
-                                   event_data.event_id);
+           ret = init_event_waiter(p, &event_waiters[i], &event_data);
             if (ret)
                     goto out_unlock;
     }



[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux