On 29/01/2022 04:43, Guangguan Wang wrote:
>
> On 2022/1/25 17:42, Stefan Raspl wrote:
>>
>> That's some truly substantial improvements!
>> But we need to be careful with protocol-level changes: There are other operating systems like z/OS and AIX which have compatible implementations of SMC, too. Changes like a reduction of connections per link group or usage of reserved fields would need to be coordinated, and likely would have unwanted side-effects even when used with older Linux kernel versions.
>> Changing the protocol is "expensive" insofar as it requires time to thoroughly discuss the changes, perform compatibility tests, and so on.
>> So I would like to urge you to investigate alternative ways that do not require protocol-level changes to address this scenario, e.g. by modifying the number of completion queue elements, to see if this could yield similar results.
>>
>> Thx!
>>
>
> Yes, there are alternative ways. RNR is caused by a mismatch between the send rate and the receive rate, which means sending too fast
> or receiving too slowly. What I have done in this patch is to backpressure the sending side when it sends too fast.
>
> Another solution is to process and refill the receive queue as quickly as possible, which requires no protocol-level change.
> The following modifications are needed:
> - Enqueue cdc msgs to backlog queues instead of processing them in the rx tasklet. llc msgs remain unchanged.
> - A mempool is needed, as cdc msgs are processed asynchronously. Allocate new receive buffers from the mempool when refilling the receive queue.
> - Schedule the backlog queues to other cpus, selected by a 4-tuple or 5-tuple hash of the connections, to process the cdc msgs,
>   in order to reduce the load on the cpu where the rx tasklet runs.
>
> The pseudocode is shown below:
> rx_tasklet
>     if cdc_msgs
>         enqueue to backlog;
>         maybe smp_call_function_single_async is needed to wake up the corresponding cpu to process the backlog;
>         allocate new buffer and modify the sge in rq_wr;
>     else
>         process remains unchanged;
>     endif
>
>     post_recv rq_wr;
> end rx_tasklet
>
> smp_backlog_process in corresponding cpu, called by smp_call_function_single_async
>     for connections hashed to this cpu
>         for cdc_msgs in backlog
>             process cdc msgs;
>         end cdc_msgs
>     end connections
> end smp_backlog_process
>
> I'd like to hear your suggestions on this solution.
> Thank you.

I like this idea. This should improve the RX handling a lot!
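
To make the dispatch step concrete, here is a minimal userspace sketch of the hash-to-cpu backlog idea, not kernel code and not the actual patch. All names (conn_to_cpu, rx_dispatch, backlog_process, SKETCH_NR_CPUS, BACKLOG_DEPTH) are hypothetical, the hash is plain FNV-1a over the 4-tuple rather than whatever the kernel would use, and the mempool refill, post_recv and the cross-cpu wakeup are only marked as comments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NR_CPUS 4   /* hypothetical number of backlog cpus */
#define BACKLOG_DEPTH  64  /* hypothetical per-cpu backlog capacity (power of two) */

enum msg_type { MSG_LLC, MSG_CDC };

struct conn_tuple { uint32_t saddr, daddr; uint16_t sport, dport; };

struct rx_msg {
    enum msg_type type;
    struct conn_tuple conn;
    uint32_t payload;      /* stand-in for the real message contents */
};

struct backlog {
    struct rx_msg ring[BACKLOG_DEPTH];
    size_t head, tail;     /* head: next to process, tail: next free slot */
    uint64_t payload_sum;  /* stand-in for "process cdc msg" */
};

static struct backlog backlogs[SKETCH_NR_CPUS];
static unsigned long inline_processed;  /* msgs handled directly in the rx path */

/* Fold the four bytes of v into an FNV-1a hash state. */
static uint32_t hash_mix(uint32_t h, uint32_t v)
{
    for (int i = 0; i < 4; i++) {
        h ^= (v >> (8 * i)) & 0xffu;
        h *= 16777619u;
    }
    return h;
}

/* Map a connection's 4-tuple to a cpu, so one connection always lands on the same backlog. */
static unsigned int conn_to_cpu(const struct conn_tuple *c)
{
    uint32_t h = 2166136261u;

    h = hash_mix(h, c->saddr);
    h = hash_mix(h, c->daddr);
    h = hash_mix(h, ((uint32_t)c->sport << 16) | c->dport);
    return h % SKETCH_NR_CPUS;
}

/*
 * rx path: returns the target cpu for an enqueued cdc msg,
 * -1 if the msg was handled inline, -2 if the backlog is full.
 */
static int rx_dispatch(const struct rx_msg *msg)
{
    if (msg->type != MSG_CDC) {
        inline_processed++;          /* llc path remains unchanged */
        return -1;
    }

    unsigned int cpu = conn_to_cpu(&msg->conn);
    struct backlog *b = &backlogs[cpu];

    if (b->tail - b->head == BACKLOG_DEPTH)
        return -2;
    b->ring[b->tail % BACKLOG_DEPTH] = *msg;
    b->tail++;
    /* Real code would now take a fresh buffer from the mempool, update
     * the sge, post_recv, and kick the target cpu if it is idle. */
    return (int)cpu;
}

/* Drain one cpu's backlog; returns the number of msgs processed. */
static unsigned int backlog_process(unsigned int cpu)
{
    assert(cpu < SKETCH_NR_CPUS);
    struct backlog *b = &backlogs[cpu];
    unsigned int n = 0;

    while (b->head != b->tail) {
        b->payload_sum += b->ring[b->head % BACKLOG_DEPTH].payload;
        b->head++;
        n++;
    }
    return n;
}
```

The sketch is single-threaded on purpose. In the kernel, each backlog would need proper synchronization between the rx tasklet and the consuming cpu, and the full-backlog case would need a real policy instead of a return code.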