Re: Kernel oops with 6.14 when enabling TLS

On 3/4/25 20:44, Vlastimil Babka wrote:
> On 3/4/25 20:39, Hannes Reinecke wrote:
>> [ .. ]
>>
>> Good news and bad news ...
>> Good news: TLS works again!
>> Bad news: no errors.
>
> Wait, did you add a WARN_ON_ONCE() to the put_page() as I suggested? If yes
> and there was no error, it would have to be leaking the page. Or the path
> uses folio_put() and we'd need to put the warning there.

That triggers:
[   42.364339] page dumped because: VM_WARN_ON_FOLIO(folio_test_slab(folio))
[   42.364379] ------------[ cut here ]------------
[   42.375500] WARNING: CPU: 0 PID: 236 at ./include/linux/mm.h:1564 sk_msg_free_elem+0x157/0x180
[   42.375642] Modules linked in: tls(E) nvme_tcp(E) af_packet(E) iscsi_ibft(E) iscsi_boot_sysfs(E) xfs(E) nls_iso8859_1(E) nls_cp437(E) vfat(E) fat(E) iTCO_wdt(E) intel_pmc_bxt(E) intel_rapl_msr(E) iTCO_vendor_support(E) intel_rapl_common(E) i2c_i801(E) bnxt_en(E) i2c_mux(E) lpc_ich(E) mfd_core(E) i2c_smbus(E) virtio_balloon(E) joydev(E) button(E) nvme_fabrics(E) nvme_keyring(E) nvme_core(E) fuse(E) nvme_auth(E) efi_pstore(E) configfs(E) dmi_sysfs(E) ip_tables(E) x_tables(E) hid_generic(E) usbhid(E) ahci(E) libahci(E) libata(E) virtio_scsi(E) sd_mod(E) scsi_dh_emc(E) scsi_dh_rdac(E) scsi_dh_alua(E) qxl(E) sg(E) ghash_clmulni_intel(E) xhci_pci(E) drm_client_lib(E) drm_exec(E) drm_ttm_helper(E) sha512_ssse3(E) xhci_hcd(E) ttm(E) sha256_ssse3(E) drm_kms_helper(E) scsi_mod(E) sha1_ssse3(E) usbcore(E) scsi_common(E) drm(E) serio_raw(E) btrfs(E) blake2b_generic(E) xor(E) raid6_pq(E) efivarfs(E) qemu_fw_cfg(E) virtio_rng(E) aesni_intel(E) crypto_simd(E) cryptd(E)
[   42.393292] CPU: 0 UID: 0 PID: 236 Comm: kworker/0:1H Kdump: loaded Tainted: G E 6.14.0-rc4-default+ #316 cadaa81909a6170d00e1f47f3fc0db03c6a03650
[   42.393303] Tainted: [E]=UNSIGNED_MODULE
[   42.393305] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
[   42.393310] Workqueue: nvme_tcp_wq nvme_tcp_io_work [nvme_tcp]
[   42.393323] RIP: 0010:sk_msg_free_elem+0x157/0x180
[   42.393331] Code: ff 48 c7 c6 d0 42 4f 82 48 89 ef e8 b3 63 8a ff 0f 0b 48 8d 6a ff e9 6c ff ff ff 48 c7 c6 a0 42 4f 82 48 89 ef e8 99 63 8a ff <0f> 0b e9 c7 fe ff ff 2b 87 78 01 00 00 8b 97 c0 00 00 00 29 d0 ba
[   42.393336] RSP: 0018:ffffc9000040b798 EFLAGS: 00010282
[   42.393341] RAX: 000000000000003d RBX: ffff888110ab0858 RCX: 0000000000000027
[   42.393344] RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff88817f423748
[   42.393347] RBP: ffffea0004295e00 R08: 0000000000000000 R09: 0000000000000001
[   42.393350] R10: ffffc9000040b780 R11: ffffc9000040b4e0 R12: 0000000000000400
[   42.393353] R13: ffff888110ab0818 R14: 0000000000000002 R15: ffff88810fa669d8
[   42.393361] FS:  0000000000000000(0000) GS:ffff88817f400000(0000) knlGS:0000000000000000
[   42.393365] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   42.393369] CR2: 00007f56a6ea6da4 CR3: 000000011bfc0000 CR4: 0000000000350ef0
[   42.416071] Call Trace:
[   42.416078]  <TASK>
[   42.416084]  ? __warn+0x85/0x130
[   42.416095]  ? sk_msg_free_elem+0x157/0x180
[   42.418893]  ? report_bug+0xf8/0x1e0
[   42.418904]  ? handle_bug+0x50/0xa0
[   42.418910]  ? exc_invalid_op+0x13/0x60
[   42.418916]  ? asm_exc_invalid_op+0x16/0x20
[   42.418935]  ? sk_msg_free_elem+0x157/0x180
[   42.423206]  ? sk_msg_free_elem+0x157/0x180
[   42.423215]  __sk_msg_free+0x4f/0x100
[   42.423224]  tls_tx_records+0x118/0x190 [tls 80cce2d02933ba636eb5845a829121ac309b44ed]
[   42.426506]  bpf_exec_tx_verdict+0x249/0x5e0 [tls 80cce2d02933ba636eb5845a829121ac309b44ed]
[   42.426519]  ? srso_return_thunk+0x5/0x5f
[   42.426526]  ? __pfx_stack_trace_consume_entry+0x10/0x10
[   42.426572]  tls_sw_sendmsg+0x72f/0x9f0 [tls 80cce2d02933ba636eb5845a829121ac309b44ed]
[   42.432016]  __sock_sendmsg+0x98/0xc0
[   42.432025]  sock_sendmsg+0x5c/0xa0
[   42.432030]  ? srso_return_thunk+0x5/0x5f
[   42.432034]  ? __sock_sendmsg+0x98/0xc0
[   42.432040]  ? srso_return_thunk+0x5/0x5f
[   42.436134]  ? sock_sendmsg+0x5c/0xa0
[   42.436146]  nvme_tcp_try_send_data+0x13f/0x410 [nvme_tcp 9f4f1c84141d3edfcd3e478eb7c2fb638b4a92b3]
[   42.436159]  ? srso_return_thunk+0x5/0x5f
[   42.439452]  ? sched_balance_newidle+0x2f6/0x400
[   42.439468]  nvme_tcp_try_send+0x299/0x330 [nvme_tcp 9f4f1c84141d3edfcd3e478eb7c2fb638b4a92b3]
[   42.439479]  nvme_tcp_io_work+0x37/0xb0 [nvme_tcp 9f4f1c84141d3edfcd3e478eb7c2fb638b4a92b3]
[   42.443603]  process_scheduled_works+0x97/0x400
[   42.443614]  ? __pfx_worker_thread+0x10/0x10
[   42.443619]  worker_thread+0x105/0x240
[   42.443625]  ? __pfx_worker_thread+0x10/0x10
[   42.443630]  kthread+0xec/0x200
[   42.443639]  ? __pfx_kthread+0x10/0x10
[   42.443646]  ret_from_fork+0x30/0x50
[   42.443652]  ? __pfx_kthread+0x10/0x10
[   42.443658]  ret_from_fork_asm+0x1a/0x30
[   42.451127]  </TASK>
[   42.451131] ---[ end trace 0000000000000000 ]---

Not surprisingly, though, as the original code did a get_page(), so
there had to be a corresponding put_page() somewhere.
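
For illustration only, the pairing can be modelled in plain userspace C. This is a rough sketch, not kernel code: struct fake_page, fake_get_page(), fake_put_page() and warn_count are hypothetical names invented for the example, and the is_slab flag merely stands in for folio_test_slab() returning true. It shows why a warning placed on the put side fires once a slab-backed page has been through the get side.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for struct page; NOT the kernel structure. */
struct fake_page {
	int refcount;	/* models the page reference count */
	bool is_slab;	/* models folio_test_slab() being true */
};

/* Counts how often the modelled warning fired. */
static int warn_count;

/* Models get_page(): take one extra reference. */
static void fake_get_page(struct fake_page *p)
{
	p->refcount++;
}

/*
 * Models put_page() with the debugging check added: a slab-backed
 * page arriving here is reported, then the reference is dropped.
 */
static void fake_put_page(struct fake_page *p)
{
	if (p->is_slab) {
		warn_count++;
		fprintf(stderr, "WARN: put on slab-backed page\n");
	}
	assert(p->refcount > 0);	/* every put needs an earlier get */
	p->refcount--;
}
```

Calling fake_get_page() and then fake_put_page() on a page with is_slab set leaves the count where it started and bumps warn_count once, which is the shape of the trace above: the put exists, it was just not where the first warning had been placed.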

Cheers,

Hannes
--
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@xxxxxxxx                               +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich



