On 9/26/23 11:34, Bob Pearson wrote:
> I am working to try to reproduce the KASAN warning. Unfortunately,
> so far I am not able to see it in Ubuntu + Linus' kernel (as you
> described) on metal. The config file is different but copies the
> CONFIG_KASAN_xxx exactly as yours. With KASAN enabled it hangs on
> every iteration of srp/002 but without a KASAN warning. I am now
> building an openSuSE VM for qemu and will see if that causes the
> warning.

Hi Bob,
Did you try to understand the report that I shared? My conclusion from
the report is that with tasklets rxe_completer() only runs after
rxe_requester() has finished, whereas with work queues rxe_completer()
may run concurrently with rxe_requester(). This patch seems to fix all
issues that I ran into with the rdma_rxe workqueue patch (I have not
tried to verify the performance implications of this patch):

diff --git a/drivers/infiniband/sw/rxe/rxe_task.c b/drivers/infiniband/sw/rxe/rxe_task.c
index 1501120d4f52..6cd5d5a7a316 100644
--- a/drivers/infiniband/sw/rxe/rxe_task.c
+++ b/drivers/infiniband/sw/rxe/rxe_task.c
@@ -10,7 +10,7 @@ static struct workqueue_struct *rxe_wq;
 int rxe_alloc_wq(void)
 {
-	rxe_wq = alloc_workqueue("rxe_wq", WQ_UNBOUND, WQ_MAX_ACTIVE);
+	rxe_wq = alloc_workqueue("rxe_wq", WQ_UNBOUND, 1);
 	if (!rxe_wq)
 		return -ENOMEM;

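
To illustrate the effect I expect from capping the number of active work
items, here is a rough user-space analogy using pthreads. It is not
kernel code and does not model the real workqueue implementation: the
names toy_wq, toy_run_work(), toy_task() and toy_peak_concurrency() are
made up for this sketch, and the two threads merely stand in for
rxe_requester() and rxe_completer() being queued on the same queue. The
function reports the highest number of work items that were ever
executing at the same time.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical user-space stand-in for a workqueue with a max_active cap. */
struct toy_wq {
	pthread_mutex_t lock;      /* protects the three counters below */
	pthread_cond_t slot_free;  /* signalled when a work item finishes */
	int max_active;            /* cap on concurrently executing items */
	int running;               /* items executing right now */
	int peak;                  /* highest value 'running' ever reached */
};

/* Execute one work item, waiting first until the cap allows it. */
static void toy_run_work(struct toy_wq *wq)
{
	pthread_mutex_lock(&wq->lock);
	while (wq->running >= wq->max_active)
		pthread_cond_wait(&wq->slot_free, &wq->lock);
	if (++wq->running > wq->peak)
		wq->peak = wq->running;
	pthread_mutex_unlock(&wq->lock);

	/* Simulated work done outside the lock, so overlap is possible. */
	for (volatile int i = 0; i < 20000; i++)
		;

	pthread_mutex_lock(&wq->lock);
	wq->running--;
	pthread_cond_signal(&wq->slot_free);
	pthread_mutex_unlock(&wq->lock);
}

/* Stands in for rxe_requester() or rxe_completer() being queued repeatedly. */
static void *toy_task(void *arg)
{
	for (int n = 0; n < 200; n++)
		toy_run_work(arg);
	return NULL;
}

/* Run two tasks against one queue; return the peak overlap observed. */
int toy_peak_concurrency(int max_active)
{
	struct toy_wq wq = { .max_active = max_active };
	pthread_t requester, completer;
	int ret;

	assert(max_active >= 1);
	pthread_mutex_init(&wq.lock, NULL);
	pthread_cond_init(&wq.slot_free, NULL);

	ret = pthread_create(&requester, NULL, toy_task, &wq);
	assert(ret == 0);
	ret = pthread_create(&completer, NULL, toy_task, &wq);
	assert(ret == 0);
	(void)ret;
	pthread_join(requester, NULL);
	pthread_join(completer, NULL);

	pthread_cond_destroy(&wq.slot_free);
	pthread_mutex_destroy(&wq.lock);
	return wq.peak;
}
```

With a cap of 1, toy_peak_concurrency(1) can only ever report 1, i.e.
the two tasks never overlap. With a larger cap the result depends on
scheduling and may reach 2, which is the kind of overlap I suspect
behind the report.
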
Thanks,
Bart.