When the master SM fails over to the standby SM, the following
sequence of events causes a segfault in ibacm. In our scenario, we
are using a Mellanox CX3 in a virtualized environment.

1. IBV_EVENT_CLIENT_REREGISTER
2. IBV_EVENT_GID_CHANGE

During the client reregistration, a multicast join request is sent.
Then, while the GID table change async event is handled, the acm port
is toggled. Because the physical IB port stays ACTIVE, the response to
the multicast join request can arrive before the acm_port is set back
to the UP state, that is, while the port has no provider assigned.
This causes acm_gid_index to dereference a NULL pointer.

Thus, this patch delays the handling of the MAD until the next event
if there is no provider in the acm port.

ibacm[39381]: segfault at bc ip 0000000000402e59 sp 00007fc77c267bf8
error 4 in ibacm[400000+c000]

(gdb) bt
prov/acmp/src/acmp.c:658
src/acm.c:2926

Signed-off-by: Wei Lin Guay <wei.lin.guay@xxxxxxxxxx>
---
 ibacm/src/acm.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/ibacm/src/acm.c b/ibacm/src/acm.c
index 59385fb..38dc464 100644
--- a/ibacm/src/acm.c
+++ b/ibacm/src/acm.c
@@ -2856,6 +2856,12 @@ static void acmc_recv_mad(struct acmc_port *port)
 	struct umad_hdr *hdr;
 
 	acm_log(2, "\n");
+
+	if (!port->prov) {
+		acm_log(1, "no provider assigned to port\n");
+		return;
+	}
+
 	len = sizeof(resp.sa_mad);
 	ret = umad_recv(port->mad_portid, &resp.umad, &len, 0);
 	if (ret < 0) {
-- 
2.4.3
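For illustration only, and not part of the patch: the guard can be modelled in isolation with a minimal sketch. It assumes a simplified port whose provider pointer is NULL while the port is toggled down. The names sketch_provider, sketch_port and sketch_recv_mad are hypothetical stand-ins, not ibacm symbols, and the pending counter merely imitates a MAD that stays queued because the handler returns before receiving it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a provider; not an ibacm structure. */
struct sketch_provider {
	int handled;                    /* MADs passed on to the provider */
};

/* Hypothetical stand-in for an acm port; not an ibacm structure. */
struct sketch_port {
	struct sketch_provider *prov;   /* NULL while the port is toggled down */
	int pending;                    /* MADs still waiting to be received */
};

/*
 * Returns 1 if one MAD was received and handed to the provider,
 * 0 if handling was deferred (no provider) or nothing was pending.
 */
static int sketch_recv_mad(struct sketch_port *port)
{
	assert(port != NULL);

	if (!port->prov) {
		/* Return before consuming anything: the MAD stays queued
		 * and can be handled on the next event. */
		fprintf(stderr, "no provider assigned to port\n");
		return 0;
	}

	if (port->pending == 0)
		return 0;

	port->pending--;
	port->prov->handled++;          /* safe: prov was checked above */
	return 1;
}
```

Calling sketch_recv_mad() on a port whose prov is NULL leaves pending untouched. Once a provider is attached, the next call consumes the queued MAD, which is the behaviour the commit message describes as delaying the handling until the next event.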