On 2022/10/14 8:05, Max Gurtovoy wrote:
Sorry for the late response, we have holidays in my country.
I still can't understand how this patch fixes your problem if you use ConnectX-5, since we use adaptive re-transmission by default and it is faster than 256 msec to re-transmit.
Adaptive re-transmission? Do you mean NAK-triggered retransmission?
NAK-triggered retransmission is very fast, but timeout-triggered retransmission
is very slow. It is possible that all packets of a QP are lost, in which case
the receiver HBA cannot send a NAK.
From our analysis, we didn't see any other adaptive re-transmission.
If there is any other adaptive re-transmission, can you explain it?
This patch modifies the waiting time for timeout-triggered re-transmission, so if all
packets of a QP are lost, the re-transmission waiting time becomes shorter.
Did you disable it?
We do not disable anything.
I'll try to re-spin it internally again.
If you need more information, please feel free to contact me.
Thank you.
On 10/10/2022 12:12 PM, Chao Leng wrote:
Hi, Max
Could you comment on this? Thank you.
On 2022/8/29 21:15, Chao Leng wrote:
On 2022/8/29 17:06, Sagi Grimberg wrote:
If so, which devices did you use?
The host HBA is Mellanox Technologies MT27800 Family [ConnectX-5];
The switch and storage are Huawei equipment.
In principle, switches and storage devices from other vendors
have the same problem.
If you think it is necessary, we can test other vendors' switches
and a Linux target.
Why was the 2s default chosen, and what is the downside of a 250 ms ack timeout? And why is nvme-rdma different from all other kernel rdma
The downside is redundant retransmission if packets are delayed by more than
250 ms in the network and finally reach the receiver.
The packet delay may exceed 250 ms only in extreme scenarios.
Sounds like the default needs to be changed if it only addresses the
extreme scenarios...
consumers that it needs to set this explicitly?
Real-time transaction services are sensitive to delay.
nvme-rdma will be used in real-time transactions.
Real-time transaction services cannot tolerate packets being delayed
by more than 250 ms in the network.
So we need to set the ack timeout to 262ms.
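For reference, the arithmetic behind these numbers can be sketched as follows. This is only a minimal sketch, assuming the InfiniBand-style encoding in which the ack timeout is 4.096 us * 2^exponent. The helper names are hypothetical and the exponents themselves are not stated in this thread. Under that assumption, exponent 16 gives about 268 ms (about 262 ms if the base is rounded to 4 us) and exponent 19 gives about 2.1 s, which would match the 2s default.

```python
# Sketch only: assumes ack timeout = 4.096 us * 2**exponent (InfiniBand-style encoding).
# Function names are hypothetical, not kernel APIs.

BASE_NS = 4096  # 4.096 us expressed in nanoseconds


def ack_timeout_ns(exponent: int) -> int:
    """Return the timeout in nanoseconds for an exponent in 1..31."""
    if not 1 <= exponent <= 31:
        raise ValueError("exponent must be in 1..31 for this sketch")
    return BASE_NS << exponent


def smallest_exponent_for(target_ns: int) -> int:
    """Return the smallest exponent whose timeout is at least target_ns."""
    for exponent in range(1, 32):
        if ack_timeout_ns(exponent) >= target_ns:
            return exponent
    raise ValueError("target exceeds the largest encodable timeout")


if __name__ == "__main__":
    for target_ms in (250, 2000):
        exp = smallest_exponent_for(target_ms * 1_000_000)
        print(f"{target_ms} ms -> exponent {exp}, {ack_timeout_ns(exp) / 1e6:.1f} ms")
```

Because each step doubles the timeout, this encoding has no value between roughly 134 ms and 268 ms, so a 250 ms target rounds up to the next step.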
While I don't disagree with the change itself, I do disagree why this
needs to be driven by nvme-rdma locally. If all kernel rdma consumers
need this (and if not, I'd like to understand why), this needs to be set in the rdma core.
Changing the default set in the rdma core is another option.
But it will affect all applications based on RDMA.
Max, what do you think? Thank you.