Re: [PATCH 1/1] Revert "RDMA/rxe: Add workqueue support for rxe tasks"

On 2023/9/27 4:24, Bart Van Assche wrote:
On 9/26/23 11:34, Bob Pearson wrote:
I am trying to reproduce the KASAN warning. Unfortunately, so far I am
not able to see it with Ubuntu + Linus' kernel (as you described) on
bare metal. The config file is different but copies your
CONFIG_KASAN_xxx settings exactly. With KASAN enabled it hangs on every
iteration of srp/002, but without a KASAN warning. I am now building an
openSUSE VM for qemu and will see if that causes the warning.

Hi Bob,

Did you try to understand the report that I shared? My conclusion from
the report is that when using tasklets, rxe_completer() only runs after
rxe_requester() has finished, and that when using work queues,
rxe_completer() may run concurrently with rxe_requester(). This patch
seems to fix all issues that I ran into with the rdma_rxe workqueue
patch (I have not tried to verify the performance implications of this
patch):

diff --git a/drivers/infiniband/sw/rxe/rxe_task.c b/drivers/infiniband/sw/rxe/rxe_task.c
index 1501120d4f52..6cd5d5a7a316 100644
--- a/drivers/infiniband/sw/rxe/rxe_task.c
+++ b/drivers/infiniband/sw/rxe/rxe_task.c
@@ -10,7 +10,7 @@ static struct workqueue_struct *rxe_wq;

  int rxe_alloc_wq(void)
  {
-       rxe_wq = alloc_workqueue("rxe_wq", WQ_UNBOUND, WQ_MAX_ACTIVE);
+       rxe_wq = alloc_workqueue("rxe_wq", WQ_UNBOUND, 1);
         if (!rxe_wq)
                 return -ENOMEM;

Hi, Bart

With the above change applied, I still hit a similar problem, although it occurs very rarely. With the following change, the problem has not occurred so far.

diff --git a/drivers/infiniband/sw/rxe/rxe_task.c b/drivers/infiniband/sw/rxe/rxe_task.c
index 1501120d4f52..3189c3705295 100644
--- a/drivers/infiniband/sw/rxe/rxe_task.c
+++ b/drivers/infiniband/sw/rxe/rxe_task.c
@@ -10,7 +10,7 @@ static struct workqueue_struct *rxe_wq;

 int rxe_alloc_wq(void)
 {
-       rxe_wq = alloc_workqueue("rxe_wq", WQ_UNBOUND, WQ_MAX_ACTIVE);
+       rxe_wq = alloc_workqueue("rxe_wq", WQ_HIGHPRI | WQ_UNBOUND, 1);
        if (!rxe_wq)
                return -ENOMEM;


With tasklets, the problem does not occur either.

With "alloc_workqueue("rxe_wq", WQ_HIGHPRI | WQ_UNBOUND, 1);", an ordered workqueue with high priority is allocated.

For the same number of work items, the ordered workqueue has the same running time as the tasklet. But tasklets are based on softirq, so their scheduling overhead is lower than that of a workqueue. In theory, therefore, the tasklet should perform better than the ordered workqueue.
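
To illustrate why limiting the queue to one active item removes the overlap between rxe_requester() and rxe_completer(), here is a small userspace model. It is only a sketch, not kernel code: peak_concurrency(), MAX_ITEMS and the tick-based timing are hypothetical names and assumptions made for illustration, and the model ignores priorities and CPU placement. Items are started in FIFO order while fewer than max_active are running, and the function reports the largest number of items that ran at the same time.

```c
#define MAX_ITEMS 16

/*
 * Userspace model only: none of these names exist in the rxe driver.
 * ticks[i] is how many time steps item i needs (must be >= 1).
 * Items start in FIFO order, with at most max_active running at once.
 * Returns the peak number of simultaneously running items, or -1 on
 * invalid input.
 */
static int peak_concurrency(const int *ticks, int n, int max_active)
{
	int remaining[MAX_ITEMS] = {0};
	int next = 0;	/* index of the next item to start */
	int done = 0;	/* number of finished items */
	int peak = 0;

	if (n > MAX_ITEMS || max_active < 1)
		return -1;
	for (int i = 0; i < n; i++)
		if (ticks[i] < 1)
			return -1;

	while (done < n) {
		int running = 0;

		/* Count the items that are still executing. */
		for (int i = 0; i < next; i++)
			if (remaining[i] > 0)
				running++;

		/* Start queued items while a slot is free. */
		while (next < n && running < max_active) {
			remaining[next] = ticks[next];
			next++;
			running++;
		}
		if (running > peak)
			peak = running;

		/* Advance time by one tick and retire finished items. */
		for (int i = 0; i < next; i++)
			if (remaining[i] > 0 && --remaining[i] == 0)
				done++;
	}
	return peak;
}
```

In this model, two items standing in for the requester and the completer never overlap when max_active is 1, while any larger limit lets the second item start before the first has finished.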

Best Regards,
Zhu Yanjun


Thanks,

Bart.



