On Wed, Sep 27, 2023 at 4:37 AM Bart Van Assche <bvanassche@xxxxxxx> wrote:
>
> On 9/26/23 11:34, Bob Pearson wrote:
> > I am working to try to reproduce the KASAN warning. Unfortunately,
> > so far I am not able to see it in Ubuntu + Linus' kernel (as you
> > described) on metal. The config file is different but copies the
> > CONFIG_KASAN_xxx exactly as yours. With KASAN enabled it hangs on
> > every iteration of srp/002 but without a KASAN warning. I am now
> > building an openSuSE VM for qemu and will see if that causes the
> > warning.
>
> Hi Bob,
>
> Did you try to understand the report that I shared? My conclusion from
> the report is that when using tasklets rxe_completer() only runs after
> rxe_requester() has finished and also that when using work queues that
> rxe_completer() may run concurrently with rxe_requester(). This patch
> seems to fix all issues that I ran into with the rdma_rxe workqueue
> patch (I have not tried to verify the performance implications of this
> patch):

In the same test environment as described at
https://lore.kernel.org/all/4e7aac82-f006-aaa7-6769-d1c9691a0cec@xxxxxxxxx/T/#m3294d00f5cf3247dfdb2ea3688b1467167f72704,
RXE with workqueue performs worse than RXE with tasklet. Sometimes RXE
with workqueue does not work properly. RXE needs this commit.

>
> diff --git a/drivers/infiniband/sw/rxe/rxe_task.c b/drivers/infiniband/sw/rxe/rxe_task.c
> index 1501120d4f52..6cd5d5a7a316 100644
> --- a/drivers/infiniband/sw/rxe/rxe_task.c
> +++ b/drivers/infiniband/sw/rxe/rxe_task.c
> @@ -10,7 +10,7 @@ static struct workqueue_struct *rxe_wq;
>
>   int rxe_alloc_wq(void)
>   {
> -       rxe_wq = alloc_workqueue("rxe_wq", WQ_UNBOUND, WQ_MAX_ACTIVE);
> +       rxe_wq = alloc_workqueue("rxe_wq", WQ_UNBOUND, 1);
>         if (!rxe_wq)
>                 return -ENOMEM;
>
> Thanks,
>
> Bart.
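
For readers who want to see the effect of the quoted one-line change outside the kernel, here is a rough userspace analogy using pthreads. It is only a sketch and is not rxe or workqueue code: struct pool, work_fn() and peak_concurrency() are hypothetical names made up for this illustration, and the max_active field merely models the idea of capping how many work items may be in flight at once. It does not reproduce the kernel's actual workqueue semantics.

```c
/* Userspace analogy only; none of this is kernel or rxe code. */
#include <assert.h>
#include <pthread.h>
#include <sched.h>

#define NUM_ITEMS 8

/* Hypothetical pool that caps the number of concurrently running items. */
struct pool {
	pthread_mutex_t lock;
	pthread_cond_t slot_free;
	unsigned int max_active;	/* models the cap on in-flight work */
	unsigned int running;		/* items executing right now */
	unsigned int peak;		/* highest value of running observed */
};

static void *work_fn(void *arg)
{
	struct pool *p = arg;

	/* Wait until fewer than max_active items are running. */
	pthread_mutex_lock(&p->lock);
	while (p->running >= p->max_active)
		pthread_cond_wait(&p->slot_free, &p->lock);
	p->running++;
	if (p->running > p->peak)
		p->peak = p->running;
	pthread_mutex_unlock(&p->lock);

	sched_yield();			/* stand-in for real work */

	/* Release the slot and wake one waiter. */
	pthread_mutex_lock(&p->lock);
	p->running--;
	pthread_cond_signal(&p->slot_free);
	pthread_mutex_unlock(&p->lock);
	return NULL;
}

/* Run NUM_ITEMS items under the given cap and return the observed peak. */
static unsigned int peak_concurrency(unsigned int max_active)
{
	struct pool p = { .max_active = max_active };
	pthread_t tid[NUM_ITEMS];
	int i, rc;

	pthread_mutex_init(&p.lock, NULL);
	pthread_cond_init(&p.slot_free, NULL);

	for (i = 0; i < NUM_ITEMS; i++) {
		rc = pthread_create(&tid[i], NULL, work_fn, &p);
		assert(rc == 0);
	}
	for (i = 0; i < NUM_ITEMS; i++)
		pthread_join(tid[i], NULL);

	pthread_cond_destroy(&p.slot_free);
	pthread_mutex_destroy(&p.lock);
	return p.peak;
}
```

Calling peak_concurrency(1) can never report more than one item running at a time, which is the serialization the patch relies on. With a larger cap, items are free to overlap, which is the situation in which rxe_completer() may run concurrently with rxe_requester(). The cost is the same as in the patch: a cap of 1 gives up any parallelism between work items.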