On 2/25/25 15:13, Hangbin Liu wrote:
> On Tue, Feb 25, 2025 at 01:05:24PM +0200, Nikolay Aleksandrov wrote:
>>> @@ -592,15 +611,17 @@ static void bond_ipsec_del_sa(struct xfrm_state *xs)
>>>  		real_dev->xfrmdev_ops->xdo_dev_state_delete(xs);
>>>  out:
>>>  	netdev_put(real_dev, &tracker);
>>> -	mutex_lock(&bond->ipsec_lock);
>>> -	list_for_each_entry(ipsec, &bond->ipsec_list, list) {
>>> -		if (ipsec->xs == xs) {
>>> -			list_del(&ipsec->list);
>>> -			kfree(ipsec);
>>> -			break;
>>> -		}
>>> -	}
>>> -	mutex_unlock(&bond->ipsec_lock);
>>> +
>>> +	xfrm_work = kmalloc(sizeof(*xfrm_work), GFP_ATOMIC);
>>> +	if (!xfrm_work)
>>> +		return;
>>> +
>>
>> What happens if this allocation fails? I think you'll leak memory and
>> potentially call the xdo_dev callbacks for this xs again because it's
>> still in the list. Also this xfrm_work memory doesn't get freed anywhere,
>> so you're leaking it as well.
>
> Yes, I thought about this too simply and forgot to free the memory.
>
>> Perhaps you can do this allocation in add_sa, it seems you can sleep
>> there and potentially return an error if it fails, so this can never
>> fail later. You'll have to be careful with the freeing dance though.
>
> Hmm, if we allocate this in add_sa, how do we get the xfrm_work back
> in del_sa? Adding the xfrm_work to another list would mean we have to
> sleep again to find it in del_sa.

Well, you have struct bond_ipsec and it is tied to the work's lifetime,
so you can stick the work there. :) I haven't looked closely at how
feasible that is.

>> Alternatively, make the work a part of struct bond so it doesn't need
>> memory management, but then you need a mechanism to queue these items
>> (e.g. a separate list with a spinlock) and would have more complexity
>> with freeing in parallel.
>
> I used a delayed work queue in bond for my draft patch. As you said, it
> needs another list to queue the xs. And during the gc work, we need to
> take a spinlock again to get the xs out...

Correct, it's a different kind of mess. :)

>>> +	INIT_WORK(&xfrm_work->work, bond_xfrm_state_gc_work);
>>> +	xfrm_work->bond = bond;
>>> +	xfrm_work->xs = xs;
>>> +	xfrm_state_hold(xs);
>>> +
>>> +	queue_work(bond->wq, &xfrm_work->work);
>>
>> Note that nothing waits for this work anywhere, and .ndo_uninit runs
>> before bond's .priv_destructor, which means ipsec_lock will be destroyed
>> and then used afterwards when bond->wq is destroyed from the destructor,
>> if there were any queued works.
>
> Do you mean we need to register the work queue in bond_init and cancel
> it in bond_work_cancel_all()?
>
> Thanks
> Hangbin

That is one way; the other is that if you have access to the workqueue
items, you can cancel them directly, which should be easier
(i.e. cancel_delayed_work_sync). Regardless of which way you choose to
solve this (gc or work in bond_ipsec), there will be some dance to be
done to get the sequence of events right, and it will not be
straightforward.

Cheers,
 Nik
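
For illustration, here is a rough, untested sketch of the struct
bond_ipsec variant suggested above: the work item lives in the entry
that add_sa already allocates (where sleeping and failing cleanly are
allowed), so del_sa never allocates. The gc_work and bond fields and the
work function body below are assumptions, not code from the patch:

struct bond_ipsec {
	struct list_head list;
	struct xfrm_state *xs;
	/* hypothetical additions: */
	struct bonding *bond;		/* back-pointer for the work */
	struct work_struct gc_work;	/* INIT_WORK()ed in add_sa */
};

static void bond_xfrm_state_gc_work(struct work_struct *work)
{
	struct bond_ipsec *ipsec = container_of(work, struct bond_ipsec,
						gc_work);
	struct bonding *bond = ipsec->bond;

	/* unlink and free in sleepable context, where the mutex is fine */
	mutex_lock(&bond->ipsec_lock);
	list_del(&ipsec->list);
	mutex_unlock(&bond->ipsec_lock);

	xfrm_state_put(ipsec->xs);	/* drop the hold taken in del_sa */
	kfree(ipsec);
}

del_sa then only has to take a hold on the state and queue the
pre-allocated work:

	xfrm_state_hold(xs);
	queue_work(bond->wq, &ipsec->gc_work);

How del_sa reaches the right ipsec entry without sleeping is the
"freeing dance" that still has to be worked out.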
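
A similarly rough sketch of the other variant (work embedded in struct
bond plus a spinlock-protected gc list, so del_sa allocates nothing).
All xfrm_gc_* names are made up; moving an entry from ipsec_list onto
the gc list without sleeping is exactly the "different kind of mess"
mentioned above and is not shown:

	/* hypothetical additions to struct bonding: */
	struct work_struct xfrm_gc_work;
	struct list_head xfrm_gc_list;
	spinlock_t xfrm_gc_lock;

static void bond_xfrm_gc_work(struct work_struct *work)
{
	struct bonding *bond = container_of(work, struct bonding,
					    xfrm_gc_work);
	struct bond_ipsec *ipsec, *tmp;
	LIST_HEAD(gc);

	/* grab the whole batch under the spinlock, free it outside */
	spin_lock_bh(&bond->xfrm_gc_lock);
	list_splice_init(&bond->xfrm_gc_list, &gc);
	spin_unlock_bh(&bond->xfrm_gc_lock);

	list_for_each_entry_safe(ipsec, tmp, &gc, list) {
		list_del(&ipsec->list);
		xfrm_state_put(ipsec->xs);
		kfree(ipsec);
	}
}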
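
Either way, the teardown ordering issue raised at the end remains: the
queued work must be waited for while ipsec_lock is still valid, i.e.
before .ndo_uninit finishes, since bond->wq is only destroyed later in
the .priv_destructor. A minimal sketch, assuming the mutex is torn down
in bond_uninit():

	/* in bond_uninit(), before ipsec_lock goes away */
	flush_workqueue(bond->wq);	/* wait for any queued gc work */
	mutex_destroy(&bond->ipsec_lock);

With the per-entry work variant, cancel_work_sync() on each entry's
gc_work also works, as long as it is not called while holding
ipsec_lock (the work function itself takes that mutex), and the caller
then frees whatever a cancelled work would have freed.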