I believe that there is a serious bug in drivers/infiniband/core/cm.c that can cause kernel memory corruption. The problem arises from the interaction between cm_remove_one() and this code in cm_migrate():

https://code.woboq.org/linux/linux/drivers/infiniband/core/cm.c.html#3968

As I understand it, this code swaps the values of prim_send_port_not_ready and altr_send_port_not_ready when we fail over to the alternate path.

In cm_remove_one(), prim_send_port_not_ready and altr_send_port_not_ready are set to true for every cm_id_priv associated with the client being removed:

https://code.woboq.org/linux/linux/drivers/infiniband/core/cm.c.html#4478

I see two problems here. First, the two functions hold different locks. This means that we can lose a write to prim_send_port_not_ready or altr_send_port_not_ready if it happens concurrently with cm_migrate(). Second, while we swap the values of prim_send_port_not_ready and altr_send_port_not_ready, the associated cm_priv_prim_list and cm_priv_altr_list lists are *not* swapped.

I believe that either bug can cause a subsequent call to cm_destroy_id() to write to the cm_device that cm_remove_one() has already freed, potentially corrupting kernel memory. The issue is the two list_del() calls here:

https://code.woboq.org/linux/linux/drivers/infiniband/core/cm.c.html#1103

If we lose the race with cm_migrate(), then the list_del() call can happen when it should not have. Alternatively, if cm_migrate() has been invoked an odd number of times for the cm_id_priv and only one of the prim port/altr port has been set to not ready, then list_del() will be called on the wrong list.

I'm not very familiar with this code so I'm unsure of the best way to fix it. One approach would be:

1) Hold cm.lock in cm_migrate() in addition to the cm_id_priv's lock, and
2) Also swap the contents of the two lists in cm_migrate()

If anybody could give input on my analysis and the proposed solution I'd really appreciate it.

Thanks,
Ryan