On Thu, Nov 24, 2022 at 04:19:35PM +0800, Chao Leng wrote:
>
>
> On 2022/11/24 3:48, Jason Gunthorpe wrote:
> > On Wed, Nov 23, 2022 at 10:13:48AM +0800, Chao Leng wrote:
> > >
> > >
> > > On 2022/11/22 22:08, Jason Gunthorpe wrote:
> > > > On Tue, Nov 22, 2022 at 05:02:06PM +0800, Chao Leng wrote:
> > > > > Now the default packet lifetime(CMA_IBOE_PACKET_LIFETIME) is 18.
> > > > > That means the minimum ack timeout is 2 seconds(2^(18+1)*4us=2.097seconds).
> > > > > The packet lifetime means the maximum transmission time of packets
> > > > > on the network, the maximum transmission time of packets is closely
> > > > > related to the network. 2 seconds is too long for simple lossless networks.
> > > > > The packet lifetime should allow the user to adjust according to the
> > > > > network situation.
> > > > > So add a parameter for the packet lifetime.
> > > > >
> > > > > Signed-off-by: Chao Leng <lengchao@xxxxxxxxxx>
> > > > > ---
> > > > >  drivers/infiniband/core/cma.c | 6 +++++-
> > > > >  1 file changed, 5 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> > > > > index cc2222b85c88..8e2ff5d610e3 100644
> > > > > --- a/drivers/infiniband/core/cma.c
> > > > > +++ b/drivers/infiniband/core/cma.c
> > > > > @@ -50,6 +50,10 @@ MODULE_LICENSE("Dual BSD/GPL");
> > > > >  #define CMA_IBOE_PACKET_LIFETIME 18
> > > > >  #define CMA_PREFERRED_ROCE_GID_TYPE IB_GID_TYPE_ROCE_UDP_ENCAP
> > > > > +static unsigned char cma_packet_lifetime = CMA_IBOE_PACKET_LIFETIME;
> > > > > +module_param_named(packet_lifetime, cma_packet_lifetime, byte, 0644);
> > > > > +MODULE_PARM_DESC(packet_lifetime, "max transmission time of the packet");
> > > >
> > > > No new module parameters
> > > >
> > > > Maybe something in netlink would be appropriate, I'm not sure how
> > > > best to deal with this.
> > > >
> > > > Really, the entire retransmit strategy in CM is not suitable for
> > > > ethernet networks, this is just one symptom.
> > > What do you think to change the CMA_IBOE_PACKET_LIFETIME to 16.
> > > The maximum transmission time of packets will be about 500+ms,
> > > I think this is long enough for RoCE networks.
> > > 2 seconds is too long to my honest.
> >
> > I don't have an informed opinion on this. I agree that 2s is too long though
> >
> > Do we have any information to back up what this should be?
> Assume the network is a clos topology with three layers, every packet
> will pass through five hops of switches. Assume the buffer of every
> switch is 128MB and the port transmission rate is 25 Gbit/s,
> the maximum transmission time of the packet is 200ms(128MB*5/25Gbit/s).
> Add double redundancy, it is less than 500ms.

We also have to worry about HCA processing time which is driven by CPU
loading more than anything

> So change the CMA_IBOE_PACKET_LIFETIME to 16,
> the maximum transmission time of the packet will be about 500+ms,
> it is long enough.

That makes sense to me, put it in a commit message and send a patch

Jason
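
For anyone who wants to check the numbers in this thread, here is a minimal user-space C sketch of the two calculations. It is only back-of-envelope arithmetic, not code from cma.c. It assumes the figures quoted above: the 2^(lifetime+1) * 4us formula for the minimum ack timeout, and 128MB read as 128 * 10^6 bytes. The helper names ack_timeout_us and queue_drain_us are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Minimum ack timeout in microseconds, using the formula quoted in the
 * thread: 2^(packet_lifetime + 1) * 4 us. The 4 us unit is the thread's
 * rounding, taken here as an assumption.
 */
uint64_t ack_timeout_us(unsigned int packet_lifetime)
{
	/* Keep the shift well inside 64 bits. */
	assert(packet_lifetime < 32);
	return (uint64_t)4 << (packet_lifetime + 1);
}

/*
 * Worst-case time in microseconds for a packet to drain through `hops`
 * switches when every switch buffer of `buffer_bytes` is full and each
 * port sends at `rate_bps` bits per second.
 */
uint64_t queue_drain_us(uint64_t buffer_bytes, unsigned int hops,
			uint64_t rate_bps)
{
	uint64_t bits = buffer_bytes * 8 * hops;

	return bits * 1000000 / rate_bps;
}
```

With these assumptions, ack_timeout_us(18) is 2097152 us (about 2.1 s) and ack_timeout_us(16) is 524288 us (about 524 ms). queue_drain_us(128000000, 5, 25000000000) is 204800 us (about 205 ms), so doubling it for margin gives roughly 410 ms, which still fits under the timeout for a lifetime of 16. HCA processing time is not modelled.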