Hi Florian,

I just looked at the lock event trace from our report, and it looks like
two distinct commit mutexes were held when the race was triggered. I think
the race is probably on the table_handle variable at
net/netfilter/nf_tables_api.c:1221, and not on the table->handle field
being written to. Racing increments to table_handle could cause it to
either overcount or undercount. Could that be an issue?

Best,
Gabe

On Fri, Aug 19, 2022 at 8:35 AM Florian Westphal <fw@xxxxxxxxx> wrote:
>
> Abhishek Shah <abhishek.shah@xxxxxxxxxxxx> wrote:
> > Hi all,
> >
> > We found a race involving the table->handle variable here
> > <https://urldefense.proofpoint.com/v2/url?u=https-3A__elixir.bootlin.com_linux_v5.18-2Drc5_source_net_netfilter_nf-5Ftables-5Fapi.c-23L1221&d=DwIBAg&c=009klHSCxuh5AI1vNQzSO0KGjl4nbi2Q0M1QLJX9BeE&r=EyAJYRJu01oaAhhVVY3o8zKgZvacDAXd_PNRtaqACCo&m=xlZC-wDg7fkTm6_4HfcoDqYfJx_OU2L5HHX2q_yTYZZCEDCFAg-9I7T1gNmXPISg&s=JYkSOriQVx_3lJhAzBo7yqhe4bnf2Sy96cPL0L1NIn8&e=>.
> > This race advances the pointer, which can cause out-of-bounds memory
> > accesses in the future. Please let us know what you think.
> >
> > Thanks!
> >
> >
> > *---------------------Report-----------------*
> > *read-write* to 0xffffffff883a01e8 of 8 bytes by task 6542 on cpu 0:
> > nf_tables_newtable+0x6dc/0xc00 net/netfilter/nf_tables_api.c:1221
> > nfnetlink_rcv_batch net/netfilter/nfnetlink.c:513 [inline]
> > [..]
>
> > *read-write* to 0xffffffff883a01e8 of 8 bytes by task 6541 on cpu 1:
> > nf_tables_newtable+0x6dc/0xc00 net/netfilter/nf_tables_api.c:1221
> > nfnetlink_rcv_batch net/netfilter/nfnetlink.c:513 [inline]
> > [..]
>
> I don't understand. Like all batch operations, nf_tables_newtable is
> supposed to run with the transaction mutex held, i.e. parallel execution
> is not expected.
>
> There is a lockdep assertion at the start of nf_tables_newtable(); I
> don't see how it's possible that two threads can run this concurrently.

--
Gabriel Ryan
PhD Candidate at Columbia University