On Wed, Mar 05, 2025 at 10:38:36AM +0200, Nikolay Aleksandrov wrote:
> > @@ -617,8 +614,18 @@ static void bond_ipsec_del_sa_all(struct bonding *bond)
> >
> >  	mutex_lock(&bond->ipsec_lock);
> >  	list_for_each_entry(ipsec, &bond->ipsec_list, list) {
>
> Second time - you should use list_for_each_entry_safe if you're walking and deleting
> elements from the list.

Sorry, I missed this comment. I will update it in the next version.

>
> > +		spin_lock_bh(&ipsec->xs->lock);
> >  		if (!ipsec->xs->xso.real_dev)
> > -			continue;
> > +			goto next;
> > +
> > +		if (ipsec->xs->km.state == XFRM_STATE_DEAD) {
> > +			/* already dead no need to delete again */
> > +			if (real_dev->xfrmdev_ops->xdo_dev_state_free)
> > +				real_dev->xfrmdev_ops->xdo_dev_state_free(ipsec->xs);
>
> Have you checked if .xdo_dev_state_free can sleep?
> I see at least one that can: mlx5e_xfrm_free_state().

Hmm, this brings us back to the initial problem. We tried to avoid
sleeping while holding a spin lock (bond_ipsec_del_sa), but the new code
runs into the same issue again.

Following your reply, I also checked xdo_dev_state_add() in
bond_ipsec_add_sa_all(), which may also sleep, e.g. mlx5e_xfrm_add_state().

If we drop the spin lock around these calls, the race comes back.

Any idea about this?

Thanks
Hangbin