On Thu, Mar 06, 2025 at 09:37:53AM +0000, Hangbin Liu wrote:
> > The reason the mutex was added (instead of the spinlock used before)
> > was exactly because the add and free offload operations could sleep.
> >
> > > With your reply, I also checked the xdo_dev_state_add() in
> > > bond_ipsec_add_sa_all(), which may also sleep, e.g.
> > > mlx5e_xfrm_add_state(),
> > >
> > > If we unlock the spin lock, then the race came back again.
> > >
> > > Any idea about this?
> >
> > The race is between bond_ipsec_del_sa_all and bond_ipsec_del_sa (plus
> > bond_ipsec_free_sa). The issue is that when bond_ipsec_del_sa_all
> > releases x->lock, bond_ipsec_del_sa can immediately be called, followed
> > by bond_ipsec_free_sa.
> > Maybe dropping x->lock after setting real_dev to NULL? I checked,
> > real_dev is not used anywhere on the free calls, I think. I have
> > another series refactoring things around real_dev, I hope to be able to
> > send it soon.
> >
> > Here's a sketch of this idea:
> >
> > --- a/drivers/net/bonding/bond_main.c
> > +++ b/drivers/net/bonding/bond_main.c
> > @@ -613,8 +613,11 @@ static void bond_ipsec_del_sa_all(struct bonding *bond)
> >
> >  	mutex_lock(&bond->ipsec_lock);
> >  	list_for_each_entry(ipsec, &bond->ipsec_list, list) {
> > -		if (!ipsec->xs->xso.real_dev)
> > +		spin_lock(&ipsec->x->lock);
> > +		if (!ipsec->xs->xso.real_dev) {
> > +			spin_unlock(&ipsec->x->lock);
> >  			continue;
> > +		}
> >
> >  		if (!real_dev->xfrmdev_ops ||
> >  		    !real_dev->xfrmdev_ops->xdo_dev_state_delete ||
> > @@ -622,12 +625,16 @@ static void bond_ipsec_del_sa_all(struct bonding *bond)
> >  			slave_warn(bond_dev, real_dev,
> >  				   "%s: no slave xdo_dev_state_delete\n",
> >  				   __func__);
> > -		} else {
> > -			real_dev->xfrmdev_ops->xdo_dev_state_delete(real_dev, ipsec->xs);
> > -			if (real_dev->xfrmdev_ops->xdo_dev_state_free)
> > -				real_dev->xfrmdev_ops->xdo_dev_state_free(ipsec->xs);
> > -			ipsec->xs->xso.real_dev = NULL;
> > +			spin_unlock(&ipsec->x->lock);
> > +			continue;
> >  		}
> > +
> > +		real_dev->xfrmdev_ops->xdo_dev_state_delete(real_dev, ipsec->xs);
> > +		ipsec->xs->xso.real_dev = NULL;
>
> Set xs->xso.real_dev = NULL is a good idea. As we will break
> in bond_ipsec_del_sa()/bond_ipsec_free_sa() when there is no
> xs->xso.real_dev.
>
> For bond_ipsec_add_sa_all(), I will move the xso.real_dev = real_dev
> after .xdo_dev_state_add() in case the following situation.
>
> bond_ipsec_add_sa_all()
>   spin_unlock(&ipsec->x->lock);
>   ipsec->xs->xso.real_dev = real_dev;
>                                        __xfrm_state_delete
>                                          x->state = DEAD
>                                          - bond_ipsec_del_sa()
>                                           - .xdo_dev_state_delete()
>   .xdo_dev_state_add()

Hmm, do we still need the spin_lock in bond_ipsec_add_sa_all()? With
xs->xso.real_dev = NULL after bond_ipsec_del_sa_all(), it looks like
there is no need for the spin_lock in bond_ipsec_add_sa_all(), e.g.

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 04b677d0c45b..3ada51c63207 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -537,15 +537,27 @@ static void bond_ipsec_add_sa_all(struct bonding *bond)
 	}
 
 	list_for_each_entry(ipsec, &bond->ipsec_list, list) {
+		spin_lock_bh(&ipsec->xs->lock);
+		/* Skip dead xfrm states, they'll be freed later. */
+		if (ipsec->xs->km.state == XFRM_STATE_DEAD) {
+			spin_unlock_bh(&ipsec->xs->lock);
+			continue;
+		}
+
 		/* If new state is added before ipsec_lock acquired */
-		if (ipsec->xs->xso.real_dev == real_dev)
+		if (ipsec->xs->xso.real_dev == real_dev) {
+			spin_unlock_bh(&ipsec->xs->lock);
 			continue;
+		}
 
-		ipsec->xs->xso.real_dev = real_dev;
 		if (real_dev->xfrmdev_ops->xdo_dev_state_add(ipsec->xs, NULL)) {
 			slave_warn(bond_dev, real_dev, "%s: failed to add SA\n",
 				   __func__);
 			ipsec->xs->xso.real_dev = NULL;
 		}
+		/* Set real_dev after .xdo_dev_state_add in case
+		 * __xfrm_state_delete() is called in parallel
+		 */
+		ipsec->xs->xso.real_dev = real_dev;
 	}

The spin_lock here seems useless now. What do you think?

Thanks
Hangbin