Dear Bob, Yanjun, and Bart,

Sorry for taking a long time to reply.

> On 9/12/22 02:58, matsuda-daisuke@xxxxxxxxxxx wrote:
> > On Mon, Sep 12, 2022 12:09 AM Bart Van Assche wrote:
> >> On 9/11/22 00:10, Yanjun Zhu wrote:
> >>> I also implemented a workqueue for rxe. IMO, can we add a variable to
> >>> decide to use tasklet or workqueue?
> >>>
> >>> If user prefer using tasklet, he can set the variable to use
> >>> tasklet. And the default is tasklet. Set the variable to another
> >>> value to use workqueue.
> >
> > That's an interesting idea, but I am not sure how users specify it.
> > IIRC, tasklets are generated when rdma link is added, typically by
> > executing 'rdma link add' command. I don't think we can add
> > an device specific option to the utility(iproute2/rdma).
> >
> >> I'm in favor of removing all uses of the tasklet mechanism because of
> >> the disadvantages of that mechanism. See also:
> >> * "Eliminating tasklets" (https://lwn.net/Articles/239633/).
> >> * "Modernizing the tasklet API" (https://lwn.net/Articles/830964/).
> >> * Sebastian Andrzej Siewior's opinion about tasklets
> >>   (https://lore.kernel.org/all/YvovfXMJQAUBsvBZ@xxxxxxxxxxxxx/).
> >
> > I am also in favor of using workqueues alone not only because of the
> > disadvantages above but also to avoid complexity. I would like to know
> > if there is anybody who will bothered by the change especially in terms
> > of performance.
> >
> > Thanks,
> > Daisuke
> >
> >> Thanks,
> >>
> >> Bart.
>
> The HPE patch set for work queues (I should send it in) kept the ability to
> run tasklets or work queues. The reason was that the change was motivated by
> a benchmarking exercise and we found that the performance of tasklets was
> noticeably better for one IO stream but for 2 or more IO streams work queues
> were better because we could place the work on separate cpus. Tasklets have
> a tendency to bunch up on the same cpu. I am interested in how Matsuda got
> better/same performance for work queues.

As far as I measured the bandwidth using the ib_send_bw command, the
performance was better with workqueues. There seem to be multiple factors
that affect the result. For example, with the current implementation,
rxe_responder() can sometimes be called directly from softirq (NET_RX_SOFTIRQ)
context. I changed rxe_resp_queue_pkt() to always schedule the responder for
all incoming requests. This may have led to better utilization of multiple
processors, because softirq code and responder code are more likely to run
concurrently on different cores in this case. Tasklets are likely to run on
the same core as the softirq code because TASKLET_SOFTIRQ is processed later
than NET_RX_SOFTIRQ in __do_softirq().

That being said, I think it is also true that the performance of tasklets can
be superior to that of workqueues in some situations. When I measured the
bandwidth of RDMA Read using the ib_read_bw command, it was better with
tasklets. Additionally, the latency is generally around 40% higher with
workqueues, so it is possible that some kinds of workloads do not benefit
from using workqueues.

I therefore think we may want to preserve tasklets despite the disadvantages
Bart pointed out. While I have no objection to removing tasklets entirely, I
am in favour of Yanjun's suggestion of switching between tasklets and
workqueues with a sysctl parameter. I would like to hear what you guys think
about this.
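To make that option more concrete, below is a rough sketch of the kind of
switch I have in mind. It is only an illustration under my own assumptions:
the names (rxe_use_wq, rxe_wq, struct rxe_task layout, rxe_sched_task()) are
placeholders rather than code from an actual patch, and I use a module
parameter here simply because it is the shortest way to show the idea; a
sysctl knob would work the same way.

/*
 * Rough sketch only: rxe_use_wq, rxe_wq, rxe_sched_task() and this
 * struct rxe_task layout are placeholders, not taken from a real patch.
 */
#include <linux/interrupt.h>
#include <linux/module.h>
#include <linux/workqueue.h>

static bool rxe_use_wq;		/* false = keep the current tasklet path */
module_param_named(use_wq, rxe_use_wq, bool, 0444);
MODULE_PARM_DESC(use_wq, "Run rxe tasks in a workqueue instead of tasklets");

/* Created once at module load, e.g.:
 *	rxe_wq = alloc_workqueue("rxe", WQ_UNBOUND, WQ_MAX_ACTIVE);
 */
static struct workqueue_struct *rxe_wq;

struct rxe_task {
	struct tasklet_struct	tasklet;	/* set up with tasklet_setup() */
	struct work_struct	work;		/* set up with INIT_WORK(&task->work, rxe_task_work) */
	void			(*func)(void *arg);
	void			*arg;
};

static void rxe_task_work(struct work_struct *work)
{
	struct rxe_task *task = container_of(work, struct rxe_task, work);

	task->func(task->arg);
}

/*
 * Callers such as rxe_resp_queue_pkt() would call this instead of
 * tasklet_schedule() directly; the flag decides the execution context.
 */
void rxe_sched_task(struct rxe_task *task)
{
	if (rxe_use_wq)
		queue_work(rxe_wq, &task->work);
	else
		tasklet_schedule(&task->tasklet);
}

The point is that the choice comes down to a single flag consulted where the
task is scheduled, so the existing tasklet path would stay untouched.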
I would also like to know when Bob is going to post the patchset. Both of us
need to use workqueues, but allowing sleep for ODP and improving performance
for multiple IO streams are different matters. I suppose it will be easier to
make the changes one by one. If you need some more time to post it, I suggest
we proceed with Yanjun's idea for now. That will preserve the current
implementation of tasklets, so it should not be hard to add your changes on
top of it.

Could you let me know your thoughts?

Regards,
Daisuke Matsuda

>
> Bob