On Wed, Sep 27, 2023 at 11:51:12AM -0500, Bob Pearson wrote:
> On 9/26/23 15:24, Bart Van Assche wrote:
> > On 9/26/23 11:34, Bob Pearson wrote:
> >> I am working to try to reproduce the KASAN warning. Unfortunately,
> >> so far I am not able to see it in Ubuntu + Linus' kernel (as you
> >> described) on metal. The config file is different but copies the
> >> CONFIG_KASAN_xxx exactly as yours. With KASAN enabled it hangs on
> >> every iteration of srp/002 but without a KASAN warning. I am now
> >> building an openSuSE VM for qemu and will see if that causes the
> >> warning.
> >
> > Hi Bob,
> >
> > Did you try to understand the report that I shared? My conclusion from
> > the report is that when using tasklets rxe_completer() only runs after
> > rxe_requester() has finished and also that when using work queues that
> > rxe_completer() may run concurrently with rxe_requester(). This patch
> > seems to fix all issues that I ran into with the rdma_rxe workqueue
> > patch (I have not tried to verify the performance implications of this
> > patch):
> >
> > diff --git a/drivers/infiniband/sw/rxe/rxe_task.c b/drivers/infiniband/sw/rxe/rxe_task.c
> > index 1501120d4f52..6cd5d5a7a316 100644
> > --- a/drivers/infiniband/sw/rxe/rxe_task.c
> > +++ b/drivers/infiniband/sw/rxe/rxe_task.c
> > @@ -10,7 +10,7 @@ static struct workqueue_struct *rxe_wq;
> >
> >  int rxe_alloc_wq(void)
> >  {
> > -	rxe_wq = alloc_workqueue("rxe_wq", WQ_UNBOUND, WQ_MAX_ACTIVE);
> > +	rxe_wq = alloc_workqueue("rxe_wq", WQ_UNBOUND, 1);
> >  	if (!rxe_wq)
> >  		return -ENOMEM;
> >
> > Thanks,
> >
> > Bart.

<...>

> Nevertheless this is a good hint since it seems to imply that there is
> a race between the requester and completer which is certainly possible.

Bob, Bart,

Can you please send this change as a formal patch? We prefer a workqueue
implementation with poor performance over tasklets.

Thanks

>
> Bob