Re: [PATCH 1/1] Revert "RDMA/rxe: Add workqueue support for rxe tasks"


 




On 2023/10/4 8:46, Zhu Yanjun wrote:

On 2023/10/4 2:11, Leon Romanovsky wrote:
On Tue, Oct 03, 2023 at 11:29:42PM +0800, Zhu Yanjun wrote:
On 2023/10/3 17:59, Leon Romanovsky wrote:
On Tue, Oct 03, 2023 at 04:55:40PM +0800, Zhu Yanjun wrote:
On 2023/10/1 14:50, Leon Romanovsky wrote:
On Sun, Oct 1, 2023, at 09:47, Zhu Yanjun wrote:
On 2023/10/1 14:39, Leon Romanovsky wrote:
On Sun, Oct 1, 2023, at 09:34, Zhu Yanjun wrote:
On 2023/10/1 14:30, Leon Romanovsky wrote:
On Wed, Sep 27, 2023 at 11:51:12AM -0500, Bob Pearson wrote:
On 9/26/23 15:24, Bart Van Assche wrote:
On 9/26/23 11:34, Bob Pearson wrote:
I am trying to reproduce the KASAN warning. Unfortunately, so far I am not able to see it with Ubuntu + Linus' kernel (as you described) on metal. The config file is different but copies your CONFIG_KASAN_xxx settings exactly. With KASAN enabled it hangs on every iteration of srp/002, but without a KASAN warning. I am now building an openSUSE VM for qemu and will see if that triggers the warning.
Hi Bob,

Did you try to understand the report that I shared? My conclusion from the report is that, when using tasklets, rxe_completer() only runs after rxe_requester() has finished, and that, when using work queues, rxe_completer() may run concurrently with rxe_requester(). This patch seems to fix all the issues that I ran into with the rdma_rxe workqueue patch (I have not tried to verify the performance implications of this patch):

diff --git a/drivers/infiniband/sw/rxe/rxe_task.c b/drivers/infiniband/sw/rxe/rxe_task.c
index 1501120d4f52..6cd5d5a7a316 100644
--- a/drivers/infiniband/sw/rxe/rxe_task.c
+++ b/drivers/infiniband/sw/rxe/rxe_task.c
@@ -10,7 +10,7 @@ static struct workqueue_struct *rxe_wq;

 int rxe_alloc_wq(void)
 {
-	rxe_wq = alloc_workqueue("rxe_wq", WQ_UNBOUND, WQ_MAX_ACTIVE);
+	rxe_wq = alloc_workqueue("rxe_wq", WQ_UNBOUND, 1);
 	if (!rxe_wq)
 		return -ENOMEM;
With this commit, a test ran for several days. A similar problem still occurred.

The problem is very similar to the one that Bart mentioned.

It is very possible that, with WQ_MAX_ACTIVE changed to 1, this problem is only alleviated.

In alloc_workqueue() (kernel/workqueue.c):

__printf(1, 4)
struct workqueue_struct *alloc_workqueue(const char *fmt,
					 unsigned int flags,
					 int max_active, ...)
{
	va_list args;
	struct workqueue_struct *wq;
	struct pool_workqueue *pwq;

	/*
	 * Unbound && max_active == 1 used to imply ordered, which is no longer
	 * the case on many machines due to per-pod pools. While
	 * alloc_ordered_workqueue() is the right way to create an ordered
	 * workqueue, keep the previous behavior to avoid subtle breakages.
	 */
	if ((flags & WQ_UNBOUND) && max_active == 1)	/* <-- this makes the workqueue ordered */
		flags |= __WQ_ORDERED;
...

Does this mean that the ordered workqueue only covers up the root cause? When the workqueue is changed to an ordered one, it is difficult to reproduce this problem.
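
For reference, below is a minimal sketch (not part of the patch; the helper name demo_alloc_rxe_wq() is made up) of the three allocation variants discussed in this thread. Per the comment quoted above, alloc_ordered_workqueue() is the documented way to request ordered behavior explicitly:

#include <linux/workqueue.h>

/* Illustration only, not the actual rxe code. */
static struct workqueue_struct *demo_alloc_rxe_wq(void)
{
	/* Original workqueue patch: up to WQ_MAX_ACTIVE concurrent work items. */
	/* return alloc_workqueue("rxe_wq", WQ_UNBOUND, WQ_MAX_ACTIVE); */

	/*
	 * Bart's change: max_active == 1.  With the kernel code quoted above,
	 * WQ_UNBOUND plus max_active == 1 also sets __WQ_ORDERED, so the
	 * workqueue becomes ordered.
	 */
	/* return alloc_workqueue("rxe_wq", WQ_UNBOUND, 1); */

	/* The explicit way to request an ordered workqueue. */
	return alloc_ordered_workqueue("rxe_wq", 0);
}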

Got it.

Is there any way to ensure the following: if a mail does not appear on the rdma mailing list, it will not be reviewed?


Sorry, my bad. I used the wrong rdma mailing list.




The analysis is as follows:

Because a work item on a workqueue runs in process context, it can sleep when it is preempted, and sometimes the sleep time will exceed the timeout of the rdma packets. As such, the rdma stack or the ULP will OOM or hang. This is why the workqueue can cause a ULP hang.

But a tasklet will not sleep, so this kind of problem does not occur with tasklets.
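
To make the execution-context difference concrete, here is a minimal, hypothetical sketch (the demo_* names are made up; this is not the actual rxe_task.c code) contrasting the two deferral mechanisms:

#include <linux/interrupt.h>
#include <linux/workqueue.h>
#include <linux/types.h>

struct demo_task {
	struct tasklet_struct tl;	/* runs in softirq (atomic) context */
	struct work_struct work;	/* runs in process context */
};

static void demo_tasklet_fn(struct tasklet_struct *t)
{
	/*
	 * Atomic context: no sleeping, no blocking locks.  The handler runs
	 * to completion on the CPU that scheduled it, so it cannot be delayed
	 * by being put to sleep.
	 */
}

static void demo_work_fn(struct work_struct *w)
{
	/*
	 * Process context: the worker may sleep or be preempted, so on a
	 * heavily loaded system it can be delayed long enough to trip RDMA
	 * retry/timeout logic, as described above.
	 */
}

static void demo_task_init(struct demo_task *d)
{
	tasklet_setup(&d->tl, demo_tasklet_fn);
	INIT_WORK(&d->work, demo_work_fn);
}

static void demo_task_kick(struct demo_task *d, bool use_wq)
{
	if (use_wq)
		schedule_work(&d->work);	/* or queue_work() on a private wq */
	else
		tasklet_schedule(&d->tl);
}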

About the performance: currently an ordered workqueue can execute at most one work item at any given time, in the queued order. So in RXE the workqueue will not execute more jobs than the tasklet.
That is because max_active was changed to 1. Once that bug is fixed, RXE will be able to spread traffic across all CPUs.


Sure. I agree with you.


After max_active is changed to 1, the workqueue is an ordered workqueue.

An ordered workqueue executes its work items one by one, possibly on different CPUs; that is, after one work item completes, the ordered workqueue executes the next one, in the queued order, on whichever CPU is available. A tasklet executes its jobs one by one on the same CPU.

So if the total number of jobs is the same, the ordered workqueue will have about the same execution time as the tasklet, but the ordered workqueue has more scheduling overhead than the tasklet.

In total, the performance of the ordered workqueue is not good compared with the tasklet.
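
As a side note, the serialization behavior described above can be observed with a tiny, hypothetical test module (sketch only; the demo_* names are made up and this is not part of the rxe driver): work items queued on an ordered workqueue run strictly one at a time, in queueing order, though successive items may land on different CPUs.

#include <linux/module.h>
#include <linux/workqueue.h>
#include <linux/delay.h>
#include <linux/smp.h>

static struct workqueue_struct *demo_wq;
static struct work_struct demo_work[4];

static void demo_fn(struct work_struct *w)
{
	int i = w - demo_work;

	/*
	 * On an ordered workqueue these callbacks never overlap and run in
	 * queueing order, but the reported CPU may differ between items.
	 */
	pr_info("demo: item %d on CPU %d\n", i, raw_smp_processor_id());
	msleep(100);	/* process context: sleeping is allowed */
}

static int __init demo_init(void)
{
	int i;

	demo_wq = alloc_ordered_workqueue("demo_wq", 0);
	if (!demo_wq)
		return -ENOMEM;

	for (i = 0; i < 4; i++) {
		INIT_WORK(&demo_work[i], demo_fn);
		queue_work(demo_wq, &demo_work[i]);
	}
	return 0;
}

static void __exit demo_exit(void)
{
	destroy_workqueue(demo_wq);	/* flushes any pending work first */
}

module_init(demo_init);
module_exit(demo_exit);
MODULE_LICENSE("GPL");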

Zhu Yanjun


