Re: [netfilter-core] [sparc64] possible circular locking / deadlock

Florian Westphal <fw@xxxxxxxxx> · Mon, 17 Jun 2019 22:11:04 +0200

Jozsef Kadlecsik <kadlec@xxxxxxxxxxxxxxxxx> wrote:
> >                -> #0 (&table[i].mutex){+.+.}:
> > [   11.698157]        lock_acquire+0x1a4/0x1c0
> > [   11.698165]        __mutex_lock+0x48/0x920
> > [   11.698173]        mutex_lock_nested+0x1c/0x40
> > [   11.698181]        nfnl_lock+0x24/0x40 [nfnetlink]
> > [   11.698196]        ip_set_nfnl_get_byindex+0x19c/0x280 [ip_set]
> > [   11.698207]        set_match_v1_checkentry+0x14/0xc0 [xt_set]
> 
> set_match_v1_checkentry() from ipset always assumed that it's called via 
> the old xtables/setsockopt interface. Thus it calls 
> ip_set_nfnl_get_byindex(), which then calls 
> nfnl_lock(NFNL_SUBSYS_IPSET). Here comes the circular dependency.

But isn't it a false positive?

> > [   11.698359]        CPU0                    CPU1
> > [   11.698366]        ----                    ----
> > [   11.698372]   lock(&net->nft.commit_mutex);
> > [   11.698381]                                lock(&table[i].mutex);
> > [   11.698390]                                lock(&net->nft.commit_mutex);
> > [   11.698400]   lock(&table[i].mutex);
> > [   11.698408]

AFAICS CPU0 takes the ipset subsys mutex after taking the nftables
transaction mutex (via checkentry of the ipset match), while CPU1 takes
the nftables subsys mutex and then the nftables transaction mutex.

The only reason this splat is generated is that the nftables and ipset
subsys mutexes currently share one lock class from lockdep's pov.

It looks like we need to extend nfnetlink to place the subsystem mutexes
in different lockdep classes.
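
To illustrate the reasoning, here is a small userspace model of a lock
dependency graph keyed by lock class. It is only a sketch, not kernel code
and not how lockdep is implemented: the names acquire_while_holding() and
replay() and the class numbers are made up for this example. It replays the
two orderings from the splat, once with both subsys mutexes in one class and
once with a class per subsystem.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_CLASSES 8

/* dep[a][b] is true once some task took class b while holding class a. */
static bool dep[MAX_CLASSES][MAX_CLASSES];

/* Is class 'to' reachable from class 'from' via recorded dependencies? */
static bool reachable(int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < MAX_CLASSES; next++)
		if (dep[from][next] && !seen[next] && reachable(next, to, seen))
			return true;
	return false;
}

/*
 * Record the dependency "held -> wanted". Returns true if the new edge
 * closes a cycle, i.e. the case a class-based checker would report.
 */
static bool acquire_while_holding(int held, int wanted)
{
	bool seen[MAX_CLASSES] = { false };
	bool cycle;

	assert(held >= 0 && held < MAX_CLASSES);
	assert(wanted >= 0 && wanted < MAX_CLASSES);

	cycle = reachable(wanted, held, seen);
	dep[held][wanted] = true;
	return cycle;
}

/*
 * Replay the two orderings from the splat with the given (hypothetical)
 * class numbers. Returns true if a circular dependency would be reported.
 */
static bool replay(int nft_subsys_class, int ipset_subsys_class,
		   int commit_class)
{
	bool splat = false;

	memset(dep, 0, sizeof(dep));

	/* CPU1: nftables subsys mutex, then nftables transaction mutex. */
	splat |= acquire_while_holding(nft_subsys_class, commit_class);

	/* CPU0: nftables transaction mutex, then ipset subsys mutex. */
	splat |= acquire_while_holding(commit_class, ipset_subsys_class);

	return splat;
}
```

With one shared class for the subsys mutexes, replay(0, 0, 1) returns true:
the checker sees subsys -> commit and then commit -> subsys. With separate
classes, replay(2, 3, 1) returns false, because the edges nft -> commit and
commit -> ipset do not form a cycle. In the kernel the analogous change
would presumably mean a separate struct lock_class_key per subsystem mutex;
that part is not shown here.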