On Fri, Mar 25, 2016 at 2:35 PM, Nikolay Borisov <kernel@xxxxxxxx> wrote:
[..]
> And having kernel.hung_task_panic sysctl set to 1 caused a lot of
> machines to reboot. In any case I don't think it's normal to have hung
> tasks when your network is out. This happens due to the
> wait_for_completion(&cm_id_priv->comp); never returning in cm_destroy_id
> function. I saw there is one place where the cm_id refcount is
> decremented via normal atomic_dec and not cm_deref_id under
> cm_req_handle's rejected label. I dunno if this is correct or not, but
> there definitely seems to be some refcounting problem.

You didn't specify your kernel version; please do so. Also, do you know
of a point in time (i.e. a kernel version) where this worked, as opposed
to the current situation?
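
To make the refcounting concern quoted above concrete, here is a minimal userspace sketch of the general hazard. It is not the kernel code. It assumes, as the quoted message implies but does not show, that a helper like cm_deref_id signals the completion when the count reaches zero, while a bare atomic_dec does not. All demo_* names are hypothetical, and a plain flag stands in for struct completion. Whether the atomic_dec in the real code path can actually drop the last reference is exactly the open question; this only shows what would happen if it did.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for an object whose destroy path waits on a completion. */
struct demo_id {
    atomic_int refcount;
    bool completed;          /* stands in for struct completion */
};

static void demo_init(struct demo_id *id)
{
    atomic_init(&id->refcount, 1);   /* initial reference, owned by the destroyer */
    id->completed = false;
}

static void demo_get(struct demo_id *id)
{
    atomic_fetch_add(&id->refcount, 1);
}

/* Put helper: signals the completion when the last reference is dropped. */
static void demo_deref(struct demo_id *id)
{
    int old = atomic_fetch_sub(&id->refcount, 1);

    assert(old > 0);
    if (old == 1)
        id->completed = true;
}

/* Bare decrement: drops a reference but never signals the completion. */
static void demo_bare_dec(struct demo_id *id)
{
    atomic_fetch_sub(&id->refcount, 1);
}

/*
 * Returns true if a waiter in the destroy path would be woken.
 * Ordering modelled here: the destroyer drops its own reference first and
 * then waits; some other holder drops the last reference afterwards.
 */
static bool demo_run(bool use_bare_dec)
{
    struct demo_id id;

    demo_init(&id);          /* refcount = 1 */
    demo_get(&id);           /* refcount = 2, second holder (e.g. a handler) */
    demo_deref(&id);         /* destroyer: refcount = 1, would now wait */

    if (use_bare_dec)
        demo_bare_dec(&id);  /* refcount = 0, nobody signals */
    else
        demo_deref(&id);     /* refcount = 0, completion signalled */

    return id.completed;
}
```

Calling demo_run(false) returns true (the waiter would be woken), while demo_run(true) returns false: the count has reached zero but the completion was never signalled, so a real waiter would block forever and eventually trip the hung task detector.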