Re: [External] : Re: Please backport: netfilter: nft_counter: Use u64_stats_t for statistic.

On Fri, 2024-10-04 at 09:39 +0200, Sebastian Andrzej Siewior wrote:
> On 2024-09-27 15:01:00 [-0400], Joseph Salisbury wrote:
> > Is it needed in all stable release patch sets, including v5.15?
> 
> Yes. I would appreciate backporting it all the way where the code is
> available. The dependencies
> 	1eacdd71b3436 ("netfilter: nft_counter: Disable BH in
> nft_counter_offload_stats().")
> 	a0b39e2dc7017 ("netfilter: nft_counter: Synchronize
> nft_counter_reset() against reader.")
> 
> were already routed via stable.
> The problem is that the seqcount has no lock associated so a reader
> could preempt a writer and then lockup spinning.

Hi,

This needs to be backported to all stable RT trees (I just checked 4.19
and 6.1; 5.15 already has it). We observed the reader live-lock in
nft_counter_fetch() on 6.1.120-rt47 (leading to a system stall) and
were also able to reproduce it with lockdep (see the stack trace below).
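
To illustrate the failure mode Sebastian describes, here is a minimal
user-space sketch. It is not the kernel code: the toy_* names are
hypothetical, the model is single-threaded (so it uses no atomics or
barriers), and the reader's retry loop is bounded only so that the
example terminates. A real seqcount reader retries without limit, so if
it preempts the writer on the same CPU while the sequence is odd, the
writer never gets to finish and the reader spins forever.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a lockless sequence counter (hypothetical, not seqcount_t). */
struct toy_seqcount {
	unsigned int sequence;	/* odd while a write is in progress */
};

struct toy_counter {
	struct toy_seqcount seq;
	uint64_t packets;
	uint64_t bytes;
};

/* Writer side: make the sequence odd before touching the data. */
static void toy_write_begin(struct toy_seqcount *s)
{
	assert((s->sequence & 1) == 0);
	s->sequence++;
}

/* Writer side: make the sequence even again once the data is consistent. */
static void toy_write_end(struct toy_seqcount *s)
{
	assert((s->sequence & 1) == 1);
	s->sequence++;
}

/*
 * Reader side with a bounded retry loop. Returns true if a consistent
 * snapshot was taken within max_tries attempts, false if the reader
 * would still be spinning.
 */
static bool toy_fetch(const struct toy_counter *c, unsigned int max_tries,
		      uint64_t *packets, uint64_t *bytes)
{
	for (unsigned int i = 0; i < max_tries; i++) {
		unsigned int start = c->seq.sequence;

		if (start & 1)
			continue;	/* write in progress: retry */

		*packets = c->packets;
		*bytes = c->bytes;

		if (c->seq.sequence == start)
			return true;	/* no writer interfered */
	}
	return false;
}
```

Calling toy_write_begin() and then toy_fetch() before toy_write_end()
models the preempted writer: the fetch fails no matter how many retries
it is given, and succeeds only once the write section has been closed.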

I'm wondering whether this patch could be applied to linux-stable, even
though on non-RT kernels it is only a performance optimization (not a fix).

The patch "netfilter: nft_counter: Use u64_stats_t for statistic."
applies cleanly to 6.1.y and 6.1.127-rt48.

Stacktrace from lockdep:
[   33.643632] ------------[ cut here ]------------
[   33.643637] WARNING: CPU: 0 PID: 972 at include/linux/seqlock.h:269
nft_counter_eval+0x6b/0xd0 [nf_tables]
[   33.643657] Modules linked in: br_netfilter bridge stp llc
xt_comment xt_recent xt_hl ip6_tables ip6t_rt ipt_REJECT nf_reject_ipv4
xt_LOG nf_log_syslog nft_limit xt_limit xt_addrtype xt_tcpudp
xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_compat
nf_tables libcrc32c nfnetlink rfkill intel_rapl_msr intel_rapl_common
ccp binfmt_misc kvm irqbypass ghash_clmulni_intel sha512_ssse3
sha512_generic sha256_ssse3 sha1_ssse3 ppdev snd_pcm snd_timer
aesni_intel snd crypto_simd cryptd soundcore pcspkr parport_pc iTCO_wdt
bochs parport drm_vram_helper intel_pmc_bxt drm_ttm_helper
iTCO_vendor_support button ttm drm_kms_helper watchdog sg joydev evdev
serio_raw drm fuse loop efi_pstore configfs qemu_fw_cfg ip_tables
x_tables autofs4 overlay nls_ascii nls_cp437 vfat fat ext4
crc32c_generic crc16 mbcache jbd2 xts ecb squashfs dm_verity dm_bufio
reed_solomon dm_mod sd_mod t10_pi crc64_rocksoft crc64 crc_t10dif
crct10dif_generic virtio_net net_failover ahci failover libahci
crct10dif_pclmul
[   33.643727]  crct10dif_common libata virtio_pci i2c_i801
crc32_pclmul scsi_mod crc32c_intel virtio_pci_legacy_dev i2c_smbus
psmouse virtio_pci_modern_dev virtio scsi_common virtio_ring lpc_ich
[   33.643739] CPU: 0 PID: 972 Comm: onboardservice Not tainted
6.1.120-rt47 #1
[   33.643742] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
0.0.0 02/06/2015
[   33.643744] RIP: 0010:nft_counter_eval+0x6b/0xd0 [nf_tables]
[   33.643759] Code: 52 3f 85 d2 74 26 65 8b 05 ba bd 52 3f 85 c0 75 1b
65 8b 05 e7 b3 52 3f a9 ff ff ff 7f 75 0d 65 8b 05 dd ba 52 3f 85 c0 74
02 <0f> 0b ff 74 24 20 4c 8d 6d 08 45 31 c9 31 c9 41 b8 01 00 00 00 31
[   33.643776] RSP: 0018:ffffa045007736a0 EFLAGS: 00010202
[   33.643778] RAX: 0000000000000001 RBX: ffffc044ffc2ae80 RCX:
00000000000026af
[   33.643780] RDX: 0000000000000001 RSI: ffff8d29050db388 RDI:
ffffffffc0af49a4
[   33.643781] RBP: ffff8d293f638060 R08: 0000000000000000 R09:
0000000000000000
[   33.643782] R10: 0000000000000001 R11: 000000009bb77572 R12:
ffffa04500773920
[   33.643783] R13: ffff8d29011db358 R14: ffff8d29011db208 R15:
ffff8d29011db240
[   33.643807] FS:  000000c000047c90(0000) GS:ffff8d293f600000(0000)
knlGS:0000000000000000
[   33.643811] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   33.643813] CR2: 000000c0005fe000 CR3: 000000003a212000 CR4:
00000000003506f0
[   33.643836] Call Trace:
[   33.643840]  <TASK>
[   33.643844]  ? __warn+0x82/0xe0
[   33.643852]  ? nft_counter_eval+0x6b/0xd0 [nf_tables]
[   33.643877]  ? report_bug+0x10e/0x180
[   33.643889]  ? handle_bug+0x41/0x70
[   33.643895]  ? exc_invalid_op+0x13/0x60
[   33.643899]  ? asm_exc_invalid_op+0x16/0x20
[   33.643912]  ? nft_counter_eval+0x24/0xd0 [nf_tables]
[   33.643931]  ? nft_counter_eval+0x6b/0xd0 [nf_tables]
[   33.643962]  nft_do_chain+0x45b/0x690 [nf_tables]
[   33.644025]  nft_do_chain_ipv4+0x78/0xa0 [nf_tables]
[   33.644046]  nf_hook_slow+0x41/0xc0
[   33.644054]  __ip_local_out+0x14c/0x300
[   33.644062]  ? ip_output+0xb0/0xb0
[   33.644074]  __ip_queue_xmit+0x1c0/0x7f0
[   33.644086]  __tcp_transmit_skb+0xabe/0xcb0
[   33.644107]  tcp_write_xmit+0x521/0x14a0
[   33.644117]  __tcp_push_pending_frames+0x32/0xf0
[   33.644120]  tcp_sendmsg_locked+0x4cd/0xc20
[   33.644133]  tcp_sendmsg+0x27/0x40
[   33.644137]  __sock_sendmsg+0x58/0x70
[   33.644142]  sock_write_iter+0x9a/0x100
[   33.644151]  vfs_write+0x2c8/0x330
[   33.644164]  ksys_write+0xc3/0xf0
[   33.644169]  do_syscall_64+0x55/0xb0
[   33.644173]  ? lock_acquire+0xc4/0x2d0
[   33.644178]  ? find_held_lock+0x2b/0x80
[   33.644182]  ? finish_task_switch.isra.0+0xca/0x380
[   33.644186]  ? lock_release+0xd0/0x2d0
[   33.644191]  ? lockdep_hardirqs_on_prepare+0xdc/0x190
[   33.644196]  ? finish_task_switch.isra.0+0xcf/0x380
[   33.644201]  ? __schedule+0x3f8/0xd20
[   33.644206]  ? restore_fpregs_from_fpstate+0x38/0x90
[   33.644211]  ? trace_x86_fpu_regs_activated+0x1f/0xb0
[   33.644213]  ? switch_fpu_return+0x58/0x90
[   33.644218]  ? exit_to_user_mode_prepare+0x1af/0x250
[   33.644223]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[   33.644227] RIP: 0033:0x40720e
[   33.644230] Code: 48 83 ec 38 e8 13 00 00 00 48 83 c4 38 5d c3 cc cc
cc cc cc cc cc cc cc cc cc cc cc 49 89 f2 48 89 fa 48 89 ce 48 89 df 0f
05 <48> 3d 01 f0 ff ff 76 15 48 f7 d8 48 89 c1 48 c7 c0 ff ff ff ff 48
[   33.644232] RSP: 002b:000000c000069980 EFLAGS: 00000216 ORIG_RAX:
0000000000000001
[   33.644234] RAX: ffffffffffffffda RBX: 0000000000000009 RCX:
000000000040720e
[   33.644236] RDX: 000000000000008c RSI: 000000c0001746c0 RDI:
0000000000000009
[   33.644237] RBP: 000000c0000699c0 R08: 0000000000000000 R09:
0000000000000000
[   33.644238] R10: 0000000000000000 R11: 0000000000000216 R12:
000000c000069b00
[   33.644239] R13: 000000000000000e R14: 000000c00016ed00 R15:
0000000000a88360
[   33.644250]  </TASK>
[   33.644250] irq event stamp: 10266
[   33.644251] hardirqs last  enabled at (10268): [<ffffffff96339836>]
vprintk_store+0x326/0x550
[   33.644256] hardirqs last disabled at (10269): [<ffffffff9633987c>]
vprintk_store+0x36c/0x550
[   33.644259] softirqs last  enabled at (9900): [<ffffffff962af77e>]
__local_bh_enable_ip+0xfe/0x140
[   33.644264] softirqs last disabled at (9904): [<ffffffffc0af49a4>]
nft_counter_eval+0x24/0xd0 [nf_tables]
[   33.644277] ---[ end trace 0000000000000000 ]---

Best regards,
Felix

> 
> Sebastian

-- 
Siemens AG
Linux Expert Center
Friedrich-Ludwig-Bauer-Str. 3
85748 Garching, Germany
