Hi, On Mon, 17 Jun 2019, Anatoly Pugachev wrote: > Getting the following git kernel trace on boot with rc.local having : > > ipset create sshguard4 hash:net > iptables -A INPUT -p tcp --dport 22 -m set --match-set sshguard4 src -j DROP In spite of "iptables", it must be the nftables compat backend. > current git kernel: > > $ uname -a > Linux ttip 5.2.0-rc5 #981 SMP Mon Jun 17 09:52:04 MSK 2019 sparc64 GNU/Linux > linux-2.6$ git desc > v5.2-rc5 > > > $ dmesg > <cut> > [ 10.356388] Adding 787176k swap on /dev/vdiska4. Priority:-2 > extents:1 across:787176k FS > [ 10.471900] EXT4-fs (vdiska1): mounting ext3 file system using the > ext4 subsystem > [ 10.487226] EXT4-fs (vdiska1): mounted filesystem with ordered data > mode. Opts: (null) > [ 11.158102] random: crng init done > [ 11.158155] random: 7 urandom warning(s) missed due to ratelimiting > > [ 11.697866] ====================================================== > [ 11.697875] WARNING: possible circular locking dependency detected > [ 11.697886] 5.2.0-rc5 #981 Not tainted > [ 11.697894] ------------------------------------------------------ > [ 11.697902] iptables/732 is trying to acquire lock: > [ 11.697913] 000000004f61aa56 (&table[i].mutex){+.+.}, at: > nfnl_lock+0x24/0x40 [nfnetlink] > [ 11.697937] > but task is already holding lock: > [ 11.697946] 000000000d652829 (&net->nft.commit_mutex){+.+.}, at: > nf_tables_valid_genid+0x18/0x60 [nf_tables] > [ 11.697973] > which lock already depends on the new lock. > > [ 11.697983] > the existing dependency chain (in reverse order) is: > [ 11.697992] > -> #1 (&net->nft.commit_mutex){+.+.}: > [ 11.698012] __mutex_lock+0x48/0x920 > [ 11.698021] mutex_lock_nested+0x1c/0x40 > [ 11.698033] nf_tables_valid_genid+0x18/0x60 [nf_tables] > [ 11.698043] nfnetlink_rcv_batch+0x24c/0x620 [nfnetlink] > [ 11.698053] nfnetlink_rcv+0x110/0x140 [nfnetlink] > [ 11.698067] netlink_unicast+0x12c/0x1e0 > [ 11.698076] netlink_sendmsg+0x324/0x360 > [ 11.698091] sock_sendmsg+0x34/0x80 > [ 11.698099] ___sys_sendmsg+0x228/0x240 > [ 11.698108] __sys_sendmsg+0x4c/0x80 > [ 11.698116] sys_sendmsg+0x18/0x40 > [ 11.698131] linux_sparc_syscall+0x34/0x44 > [ 11.698138] > -> #0 (&table[i].mutex){+.+.}: > [ 11.698157] lock_acquire+0x1a4/0x1c0 > [ 11.698165] __mutex_lock+0x48/0x920 > [ 11.698173] mutex_lock_nested+0x1c/0x40 > [ 11.698181] nfnl_lock+0x24/0x40 [nfnetlink] > [ 11.698196] ip_set_nfnl_get_byindex+0x19c/0x280 [ip_set] > [ 11.698207] set_match_v1_checkentry+0x14/0xc0 [xt_set] set_match_v1_checkentry() from ipset always assumed that it's called via the old xtables/setsockopt interface. Thus it calls ip_set_nfnl_get_byindex() which is then calls nfnl_lock(NFNL_SUBSYS_IPSET). Here comes the circular dependency. I suppose the only solution is to check wether the mutex is already held or not. Until I send the patch, the only way to avoid the issue is to use the old legacy xtables interface. Best regards, Jozsef > [ 11.698222] xt_check_match+0x238/0x260 [x_tables] > [ 11.698234] __nft_match_init+0x160/0x180 [nft_compat] > [ 11.698244] nft_match_init+0x18/0x40 [nft_compat] > [ 11.698256] nf_tables_newrule+0x57c/0x7a0 [nf_tables] > [ 11.698266] nfnetlink_rcv_batch+0x3f8/0x620 [nfnetlink] > [ 11.698275] nfnetlink_rcv+0x110/0x140 [nfnetlink] > [ 11.698284] netlink_unicast+0x12c/0x1e0 > [ 11.698292] netlink_sendmsg+0x324/0x360 > [ 11.698300] sock_sendmsg+0x34/0x80 > [ 11.698309] ___sys_sendmsg+0x228/0x240 > [ 11.698317] __sys_sendmsg+0x4c/0x80 > [ 11.698325] sys_sendmsg+0x18/0x40 > [ 11.698334] linux_sparc_syscall+0x34/0x44 > [ 11.698340] > other info that might help us debug this: > > [ 11.698351] Possible unsafe locking scenario: > > [ 11.698359] CPU0 CPU1 > [ 11.698366] ---- ---- > [ 11.698372] lock(&net->nft.commit_mutex); > [ 11.698381] lock(&table[i].mutex); > [ 11.698390] lock(&net->nft.commit_mutex); > [ 11.698400] lock(&table[i].mutex); > [ 11.698408] > *** DEADLOCK *** > > [ 11.698418] 1 lock held by iptables/732: > [ 11.698424] #0: 000000000d652829 (&net->nft.commit_mutex){+.+.}, > at: nf_tables_valid_genid+0x18/0x60 [nf_tables] > [ 11.698444] > stack backtrace: > [ 11.698454] CPU: 6 PID: 732 Comm: iptables Not tainted 5.2.0-rc5 #981 > [ 11.698463] Call Trace: > [ 11.698471] [00000000004cfde0] print_circular_bug+0x2e0/0x320 > [ 11.698480] [00000000004d4bd8] __lock_acquire+0x1d38/0x2900 > [ 11.698489] [00000000004d6084] lock_acquire+0x1a4/0x1c0 > [ 11.698498] [0000000000a06508] __mutex_lock+0x48/0x920 > [ 11.698506] [0000000000a06dfc] mutex_lock_nested+0x1c/0x40 > [ 11.698516] [000000001071c024] nfnl_lock+0x24/0x40 [nfnetlink] > [ 11.698527] [00000000107568dc] ip_set_nfnl_get_byindex+0x19c/0x280 [ip_set] > [ 11.698537] [000000001078e5d4] set_match_v1_checkentry+0x14/0xc0 [xt_set] > [ 11.698549] [0000000010310ed8] xt_check_match+0x238/0x260 [x_tables] > [ 11.698559] [000000001077cc00] __nft_match_init+0x160/0x180 [nft_compat] > [ 11.698569] [000000001077ccb8] nft_match_init+0x18/0x40 [nft_compat] > [ 11.698582] [0000000010731c3c] nf_tables_newrule+0x57c/0x7a0 [nf_tables] > [ 11.698592] [000000001071d238] nfnetlink_rcv_batch+0x3f8/0x620 [nfnetlink] > [ 11.698602] [000000001071d570] nfnetlink_rcv+0x110/0x140 [nfnetlink] > [ 11.698611] [000000000093e82c] netlink_unicast+0x12c/0x1e0 > [ 11.698620] [000000000093f484] netlink_sendmsg+0x324/0x360 > > > > Full kernel configuration file as well full dmesg messages are > available at https://github.com/mator/sparc64-dmesg/ > > system info: > > $ gcc -v > Using built-in specs. > COLLECT_GCC=gcc > COLLECT_LTO_WRAPPER=/usr/lib/gcc/sparc64-linux-gnu/8/lto-wrapper > Target: sparc64-linux-gnu > Configured with: ../src/configure -v --with-pkgversion='Debian > 8.3.0-7' --with-bugurl=file:///usr/share/doc/gcc-8/README.Bugs > --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++ --prefix=/usr > --with-gcc-major-version-only --program-suffix=-8 > --program-prefix=sparc64-linux-gnu- --enable-shared > --enable-linker-build-id --libexecdir=/usr/lib > --without-included-gettext --enable-threads=posix --libdir=/usr/lib > --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug > --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new > --enable-gnu-unique-object --disable-libquadmath > --disable-libquadmath-support --enable-plugin --enable-default-pie > --with-system-zlib --disable-libphobos --enable-objc-gc=auto > --enable-multiarch --disable-werror --with-cpu-32=ultrasparc > --enable-targets=all --with-long-double-128 --enable-multilib > --enable-checking=release --build=sparc64-linux-gnu > --host=sparc64-linux-gnu --target=sparc64-linux-gnu > Thread model: posix > gcc version 8.3.0 (Debian 8.3.0-7) > > # ldconfig -V > ldconfig (Debian GLIBC 2.28-10) 2.28 > > # ld -V > GNU ld (GNU Binutils for Debian) 2.31.1 > > PS: i wasn't able to trace which kernel version introduced this > possible deadlock... but tried (from top git tag v5.2-rc1 to bottom) > up to 4.13 kernel version... > > - E-mail : kadlec@xxxxxxxxxxxxxxxxx, kadlecsik.jozsef@xxxxxxxxxxxxx PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt Address : Wigner Research Centre for Physics, Hungarian Academy of Sciences H-1525 Budapest 114, POB. 49, Hungary