On Fri, 2019-10-04 at 16:40 -0400, Dennis Dalessandro wrote:
> From: Kaike Wan <kaike.wan@xxxxxxxxx>
>
> A TID RDMA READ request could be retried under one of the following
> conditions:
> - The RC retry timer expires;
> - A later TID RDMA READ RESP packet is received before the next
>   expected one.
> For the latter, under normal conditions, the PSN in IB space is used
> for comparison. More specifically, the IB PSN in the incoming TID RDMA
> READ RESP packet is compared with the last IB PSN of a given TID RDMA
> READ request to determine if the request should be retried. This is
> similar to the retry logic for a normal RDMA READ request.
>
> However, if a TID RDMA READ RESP packet is lost due to congestion,
> header suppression will be disabled and each incoming packet will raise
> an interrupt until the hardware flow is reloaded. Under this condition,
> each packet's KDETH PSN will be checked by software against r_next_psn
> and a retry will be requested if the packet's KDETH PSN is later than
> r_next_psn. Since each TID RDMA READ segment could have up to 64
> packets and each TID RDMA READ request could have many segments, we
> could make far more retries under such conditions, thus leading to
> RETRY_EXC_ERR status.
>
> This patch fixes the issue by removing the retry when the incoming
> packet's KDETH PSN is later than r_next_psn. Instead, it resorts to
> the RC timer and normal IB PSN comparison for any request retry.
>
> Fixes: 9905bf06e890 ("IB/hfi1: Add functions to receive TID RDMA READ
> response")
> Cc: <stable@xxxxxxxxxxxxxxx>
> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@xxxxxxxxx>
> Signed-off-by: Kaike Wan <kaike.wan@xxxxxxxxx>
> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@xxxxxxxxx>

Thanks, applied to for-rc.

-- 
Doug Ledford <dledford@xxxxxxxxxx>
    GPG KeyID: B826A3330E572FDD
    Fingerprint = AE6B 1BDA 122B 23B4 265B 1274 B826 A333 0E57 2FDD
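
Editorial illustration, not part of the patch: the retry inflation described in the quoted commit message can be sketched with a small, simplified model. Everything below is an assumption made for illustration only. The names psn_cmp24 and count_sw_retries are hypothetical and do not come from the hfi1 driver, the PSN space is assumed to be 24 bits wide, and the model covers a single segment with one lost packet and no flow reload.

```c
#include <assert.h>
#include <stdint.h>

#define PSN_MASK 0xFFFFFFu /* assumption: 24-bit PSN space */

/* Hypothetical helper: returns <0 if a is earlier than b, 0 if equal,
 * >0 if a is later than b, with wraparound in the 24-bit space. */
static int psn_cmp24(uint32_t a, uint32_t b)
{
    uint32_t diff = (a - b) & PSN_MASK;

    if (diff == 0)
        return 0;
    return (diff & 0x800000u) ? -1 : 1;
}

/* Simplified model of the per-packet software check on one segment.
 * Packet number lost_idx is dropped (pass lost_idx >= npkts for no loss).
 * When retry_on_later_psn is nonzero, every packet whose PSN is later
 * than r_next_psn counts as one retry request (the behaviour the commit
 * message describes as problematic). When it is zero, such packets
 * request no retry, and recovery is left to the RC timer and the IB PSN
 * comparison, neither of which is modelled here. */
static unsigned count_sw_retries(uint32_t first_psn, unsigned npkts,
                                 unsigned lost_idx, int retry_on_later_psn)
{
    uint32_t r_next_psn = first_psn & PSN_MASK;
    unsigned retries = 0;

    assert(npkts < 0x800000u); /* keep the segment within half the PSN space */

    for (unsigned i = 0; i < npkts; i++) {
        uint32_t psn = (first_psn + i) & PSN_MASK;
        int c;

        if (i == lost_idx)
            continue; /* this packet never arrives */

        c = psn_cmp24(psn, r_next_psn);
        if (c == 0)
            r_next_psn = (r_next_psn + 1) & PSN_MASK; /* in order: advance */
        else if (c > 0 && retry_on_later_psn)
            retries++; /* out of order: one retry request per packet */
    }
    return retries;
}
```

In this model, a 64-packet segment that loses packet 10 gives count_sw_retries(0xFFFFF0, 64, 10, 1) == 53, one retry request for each packet that follows the gap, while the same call with retry_on_later_psn set to 0 gives 0. With many segments per request, the first policy multiplies accordingly, which is consistent with the RETRY_EXC_ERR outcome the commit message reports.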