On 2019/9/23 13:01, Leon Romanovsky wrote:
> On Fri, Sep 20, 2019 at 11:55:56AM +0800, Liuyixian (Eason) wrote:
>>
>> On 2019/9/11 21:17, Liuyixian (Eason) wrote:
>>>
>>> On 2019/9/10 15:52, Leon Romanovsky wrote:
>>>> On Tue, Sep 10, 2019 at 02:40:20PM +0800, Liuyixian (Eason) wrote:
>>>>>
>>>>> On 2019/9/8 16:03, Leon Romanovsky wrote:
>>>>>> On Thu, Sep 05, 2019 at 08:31:11PM +0800, Weihang Li wrote:
>>>>>>> From: Yixian Liu <liuyixian@xxxxxxxxxx>
>>>>>>>
>>>>>>> Hip08 has the flush cqe feature, which helps to flush wqes in the work
>>>>>>> queues (sq and rq) when an error happens by transmitting the producer
>>>>>>> index to hardware through the mailbox. Flush cqe is implemented in the
>>>>>>> post send and recv verbs. However, in NVMe cases, these verbs are called
>>>>>>> in softirq context, which leads to the following calltrace with the
>>>>>>> current driver, as the mailbox used by flush cqe can go to sleep.
>>>>>>>
>>>>>>> This patch solves this problem by using workqueue to do flush cqe,
>>>>>>
>>>>>> Unbelievable, almost every bug in this driver is solved by introducing
>>>>>> a workqueue. You should fix the "sleep in flush path" issue, and not by
>>>>>> adding a new workqueue.
>>>>>>
>>>>> Hi Leon,
>>>>>
>>>>> Thanks for the comment.
>>>>> Up to now, for hip08, only one place in hns_roce_hw_v2.c uses a
>>>>> workqueue, and that is for irq prints.
>>>>
>>>> Thanks to our lack of desire to add more workqueues and previous patches
>>>> which removed extra workqueues from the driver.
>>>>
>>> Thanks, I see.
>>>
>>>>>
>>>>> The solution for flush cqe in this patch is as follows:
>>>>> When flush cqe is needed, the driver modifies the qp to the error state
>>>>> through the mailbox with the newest producer index of the sq and rq; the
>>>>> hardware can then flush all outstanding wqes in the sq and rq.
>>>>>
>>>>> That's the whole mechanism of flush cqe, and it is also the flush path.
>>>>> We can change neither the fact that the mailbox may sleep nor the fact
>>>>> that flush cqe occurs in post send/recv.
>>>>> To avoid the calltrace of flush cqe in post verbs under NVMe softirq,
>>>>> using a workqueue for flush cqe seems reasonable.
>>>>>
>>>>> As far as I know, there is no alternative solution for this situation.
>>>>> I would be very grateful if you could point me to more information.
>>>>
>>>> ib_drain_rq/ib_drain_sq/ib_drain_qp????
>>>>
>>> Hi Leon,
>>>
>>> I think these interfaces are designed for applications to check that all
>>> wqes have been processed by hardware, the so-called drain or flush.
>>> However, it is not the same as the flush in this patch. The solution in
>>> this patch is used to help the hardware generate flush cqes for
>>> outstanding wqes when the qp is in the error state.
>>>
>> Hi Leon,
>>
>> What's your opinion on the above? Do you have any further comments?
>
> My opinion didn't change, you need to read discussions about ib_drain_*()
> functions, how and why they were introduced. It is the way to go.
>
> Thanks

Hi Leon,

Thanks a lot! I will dig into those functions for my problem.

>
>>
>> Thanks.
>>
>>>>>
>>>>> Thanks
>>>>>
>>>>>> _______________________________________________
>>>>>> Linuxarm mailing list
>>>>>> Linuxarm@xxxxxxxxxx
>>>>>> http://hulk.huawei.com/mailman/listinfo/linuxarm
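
For readers following the thread, the flush path described above (move the
qp to the error state with the newest sq/rq producer index, after which the
outstanding wqes complete as flush cqes) can be sketched as a small
user-space C model. This is only an illustrative toy under stated
assumptions: every toy_* name is hypothetical, none of it is hns_roce or
ib_core code, and the mailbox command that may sleep in the real driver is
reduced here to a plain function call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: all names are hypothetical, not hns_roce symbols. */
#define TOY_RING_SIZE 8u

enum toy_qp_state { TOY_QP_RTS, TOY_QP_ERR };
enum toy_wc_status { TOY_WC_SUCCESS, TOY_WC_WR_FLUSH_ERR };

struct toy_queue {
    uint32_t head; /* producer index, advanced by the poster */
    uint32_t tail; /* consumer index, advanced on completion */
};

struct toy_qp {
    enum toy_qp_state state;
    struct toy_queue sq;
    struct toy_queue rq;
};

struct toy_cqe {
    bool is_send;
    uint32_t wqe_index;
    enum toy_wc_status status;
};

/* Post one wqe: only bumps the producer index and never blocks. */
static bool toy_post(struct toy_queue *q)
{
    if (q->head - q->tail >= TOY_RING_SIZE)
        return false; /* ring is full */
    q->head++;
    return true;
}

/* Emit one flush cqe for each outstanding wqe in [tail, pi). */
static size_t toy_flush_queue(struct toy_queue *q, uint32_t pi, bool is_send,
                              struct toy_cqe *out, size_t cap)
{
    size_t n = 0;

    while (q->tail != pi && n < cap) {
        out[n].is_send = is_send;
        out[n].wqe_index = q->tail % TOY_RING_SIZE;
        out[n].status = TOY_WC_WR_FLUSH_ERR;
        q->tail++;
        n++;
    }
    return n;
}

/*
 * Stand-in for "modify qp to error through the mailbox": the caller passes
 * the newest producer indexes, and the modelled hardware flushes every wqe
 * that is still outstanding below them. Returns the number of cqes written.
 */
static size_t toy_modify_qp_to_err(struct toy_qp *qp, uint32_t sq_pi,
                                   uint32_t rq_pi, struct toy_cqe *out,
                                   size_t cap)
{
    size_t n;

    assert(out != NULL);
    qp->state = TOY_QP_ERR;
    n = toy_flush_queue(&qp->sq, sq_pi, true, out, cap);
    n += toy_flush_queue(&qp->rq, rq_pi, false, out + n, cap - n);
    return n;
}
```

In this model, posting two send wqes and one recv wqe and then calling
toy_modify_qp_to_err() with the current head values yields three cqes, all
with TOY_WC_WR_FLUSH_ERR. The open question in the thread is only where that
last call may run, since the real mailbox command can sleep and the post
verbs may be invoked from softirq context.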