On 11/07/2024 12:59, Dan Aloni wrote:
> We observed a scenario in IB bonding where RDMA_CM_EVENT_ADDR_CHANGE
> is followed by RDMA_CM_EVENT_DISCONNECTED on a connected endpoint.
> This sequence causes a negative reference splat and subsequent
> tear-down issues due to a duplication in the disconnection path.
> This fix aligns with the approach taken in a previous change
> 4836da219781 ("rpcrdma: fix handling for
> RDMA_CM_EVENT_DEVICE_REMOVAL"), addressing a similar issue.
I think a code comment will help here. This whole handler is not very intuitive (though that may be a result of the rdma_cm state machine; the picture in other ULPs does not look materially different).
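
To make the sequence from the commit message concrete, here is a minimal userspace sketch of the failure mode and of one way to guard against it. It is not the xprtrdma code and not the patch under review: the `demo_*` names, the event enum, and the endpoint struct are all hypothetical, and the sketch only assumes that the connection holds one reference which the disconnect path drops.

```c
#include <stdbool.h>

/* Hypothetical event codes standing in for the rdma_cm events. */
enum demo_event { DEMO_ADDR_CHANGE, DEMO_DISCONNECTED };

/* Hypothetical endpoint; this is NOT the real struct rpcrdma_ep. */
struct demo_ep {
    int refcount;    /* references held on the endpoint */
    bool connected;  /* true while the connection holds its reference */
};

/*
 * Drop the connection's reference at most once.
 * Returns true if this call performed the teardown.
 */
static bool demo_disconnect_once(struct demo_ep *ep)
{
    if (!ep->connected)
        return false;  /* teardown already ran: do not put again */
    ep->connected = false;
    ep->refcount--;
    return true;
}

/* Both events funnel into the same guarded teardown. */
static void demo_cm_event(struct demo_ep *ep, enum demo_event ev)
{
    switch (ev) {
    case DEMO_ADDR_CHANGE:
    case DEMO_DISCONNECTED:
        demo_disconnect_once(ep);
        break;
    }
}

/* Replay the reported sequence and return the final reference count. */
static int demo_replay(void)
{
    struct demo_ep ep = { .refcount = 1, .connected = true };

    demo_cm_event(&ep, DEMO_ADDR_CHANGE);
    demo_cm_event(&ep, DEMO_DISCONNECTED);
    return ep.refcount;
}
```

Without the `connected` check, the second event would decrement the count a second time and drive it negative, which is the kind of splat described above. The sketch is single-threaded; in kernel code the check-and-clear would presumably need to be atomic, since the two events could race.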