On Wed, 3 Jul 2024 15:01:07 +0200 Florian Westphal <fw@xxxxxxxxx>
> Hillf Danton <hdanton@xxxxxxxx> wrote:
> > On Wed, 3 Jul 2024 12:52:15 +0200 Florian Westphal <fw@xxxxxxxxx>
> > > Hillf Danton <hdanton@xxxxxxxx> wrote:
> > > > Given trans->table goes thru the lifespan of trans, your proposal is a bandaid
> > > > if trans outlives table.
> > >
> > > trans must never outlive table.
> > >
> > What is preventing trans from being freed after closing sock, given
> > trans is freed in workqueue?
> >
> >     close sock
> >     queue work
>
> The notifier acquires the transaction mutex, locking out all other
> transactions, so no further transaction requests referencing
> the table can be queued.
>
As per the syzbot report, trans->table could be instantiated before the
notifier acquires the transaction mutex. And in fact the lock helps
trans outlive table even with your patch.

    cpu1                        cpu2
    ---                         ---
    transB->table = A
                                lock trans mutex
                                flush work
                                free A
                                unlock trans mutex

    queue work to free transB

> The work queue is flushed before potentially ripping the table
> out. After this, no transactions referencing the table can exist
> anymore; the only transactions that can still be queued are those
> coming from a different netns, and tables are scoped per netns.
>
> Table is torn down. Transaction mutex is released.
>
> Next transaction from userspace can't find the table anymore (it's gone),
> so no more transactions can be queued for this table.
>
> As I wrote in the commit message, the flush is dumb, this should first
> walk to see if there is a matching table to be torn down, and then flush
> work queue once before tearing the table down.
>
> But it's better to clearly split bug fix and such a change.