On Fri, Nov 01, 2024 at 12:37:41PM +0000, Alasdair McWilliam wrote:
> Good day,
>
> On 27/09/2024 12:32, Thorsten Leemhuis wrote:
>
> > [CCing a few people that were involved in mainlining the culprit
> > (8adbf5a42341f6e ("ice: remove af_xdp_zc_qps bitmap") in case they want
> > to provide advice]
> >
> > On 13.09.24 17:54, Alasdair McWilliam wrote:
> >> On 05/09/2024 13:50, Alasdair McWilliam wrote:
> >>
> >>>> We've been working recently on somewhat related issues and it looks like
> >>>> not every commit from [0] has been backported.
> >>>>
> >>>> $ git log --oneline v6.1.103..v6.1.104 drivers/net/ethernet/intel/ice/
> >>>> 5a80b682e3e1 ice: add missing WRITE_ONCE when clearing ice_rx_ring::xdp_prog
> >>>> 8782f0fcb19d ice: replace synchronize_rcu with synchronize_net
> >>>> 15115033f056 ice: don't busy wait for Rx queue disable in ice_qp_dis()
> >>>> 3dbc58774e58 ice: respect netif readiness in AF_XDP ZC related ndo's
> >>>>
> >>>> can you apply the rest of it on top of 6.1.107 and see the result?
> >>
> >>> The first one I've attempted doesn't apply cleanly to 6.1.107.
> >>>
> >>> Eg: d59227179949 ("ice: modify error handling when setting XSK pool in
> >>> ndo_bpf"). The above looks to have been based on code from around 6.8 or
> >>> 6.9 where the makeup of routines like ice_qp_ena() has changed. Looks
> >>> like this happened around a292ba981324 ("ice: make ice_vsi_cfg_txq()
> >>> static").
> >>>
> >>> Should I try and apply a292ba981324 as well?
> >>
> >> I just wondered if there was perhaps any further feedback on the above.
> >
> > Hmmm. No reply afaics -- but that's how it is sometimes with
> > stable/longterm kernels series, as mainline developers are not required
> > to participate in their development.
> >
> > Still it would be good to fix the problem. So unless the developers come
> > up with plan, it might be best to just revert a62c50545b4d in 6.1.y;
> > guess asking Greg to do so might be best way ahead unless some solutions
> > comes into sight within a few days.
> >
>
> It's been a minute since I've looked at this due to other commitments
> but accidentally bumped into the fault again when testing the latest 6.6
> LTS for a new feature of our software. (I forgot to revert the commit
> for "ice: remove af_xdp_zc_qps bitmap" in our build system.)
>
> This led me to wonder about the current version, and can trigger the
> same crash on 6.11.5 [3].
>
> Reverting "ice: remove af_xdp_zc_qps bitmap" [1] in the current mainline
> is a little more complicated as commit ebc33a3f8d0a ("ice: improve
> updating ice_{t,r}x_ring::xsk_pool") also changes things a little so the
> reversion doesn't work cleanly.
>
> I have tweaked everything a little the below patch [2] applies cleanly
> to 6.11.5 and 6.12-rc5 and seems to fix the fault.
>
> Thought I'd bubble this up as it's definitely still an issue in the
> mainline kernel as of now.
>
> Thanks
> Alasdair
>

Hello,

Could you please share the reproduction steps? I will look into this.

Larysa

> [1] Commit adbf5a42341f6ea038d3626cd4437d9f0ad0b2dd
>
> [2]
> https://github.com/OpenSource-THG/kernel-patches/tree/main/2024-11-ice-xskzc-page-fault
>
> [3] 6.11.5 ooops
>
> [ 565.069120] BUG: unable to handle page fault for address:
> ffffa566707380c4
> [ 565.069144] #PF: supervisor read access in kernel mode
> [ 565.069155] #PF: error_code(0x0000) - not-present page
> [ 565.069167] PGD 100000067 P4D 100000067 PUD 20ef17067 PMD 0
> [ 565.069183] Oops: Oops: 0000 [#1] PREEMPT SMP PTI
> [ 565.069195] CPU: 7 UID: 0 PID: 6967 Comm: tlndd.bin Kdump: loaded
> Tainted: G E
> 6.11.5-1.thg.836e8867d7.241031.135507.el9.x86_64 #1
> [ 565.069220] Tainted: [E]=UNSIGNED_MODULE
> [ 565.069228] Hardware name: Supermicro SYS-1028R-TDW/X10DDW-i, BIOS
> 3.2 12/16/2019
> [ 565.069241] RIP: 0010:ice_xsk_clean_rx_ring+0x37/0x110 [ice]
> [ 565.069338] Code: 55 53 48 83 ec 08 44 0f b7 af a4 00 00 00 0f b7 af
> a2 00 00 00 66 41 39 ed 74 33 48 89 fb 48 8b 4b 38 41 0f b7 c5 4c 8b 34
> c1 <41> f6 46 34 01 75 30 4c 89 f7 41 83 c5 01 e8 f6 0c 7e ce 31 c0 66
> [ 565.069365] RSP: 0018:ffffa5660f8f36d8 EFLAGS: 00010293
> [ 565.069375] RAX: 0000000000000000 RBX: ffff8bb105d38600 RCX:
> ffff8bb184930000
> [ 565.069387] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
> ffff8bb105d38600
> [ 565.069400] RBP: 00000000000007ff R08: 000000000000050b R09:
> 0000000000000000
> [ 565.069411] R10: ffff8bb10f910000 R11: 0000000000000020 R12:
> 0000000000000004
> [ 565.069422] R13: 0000000000000000 R14: ffffa56670738090 R15:
> ffff8bb1116b5740
> [ 565.069434] FS: 00007f677a5d1dc0(0000) GS:ffff8bb85fd80000(0000)
> knlGS:0000000000000000
> [ 565.069447] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 565.069457] CR2: ffffa566707380c4 CR3: 0000000120164005 CR4:
> 00000000001706f0
> [ 565.069470] Call Trace:
> [ 565.069480] <TASK>
> [ 565.069489] ? __die+0x20/0x70
> [ 565.069504] ? page_fault_oops+0x80/0x150
> [ 565.069517] ? exc_page_fault+0xcd/0x170
> [ 565.069531] ? asm_exc_page_fault+0x22/0x30
> [ 565.069546] ? ice_xsk_clean_rx_ring+0x37/0x110 [ice]
> [ 565.069598] ice_clean_rx_ring+0x16e/0x190 [ice]
> [ 565.069650] ice_down+0x2f8/0x3c0 [ice]
> [ 565.069692] ice_xdp_setup_prog+0x193/0x460 [ice]
> [ 565.069734] ice_xdp+0x7a/0xb0 [ice]
> [ 565.069774] ? __pfx_ice_xdp+0x10/0x10 [ice]
> [ 565.069813] dev_xdp_install+0xc7/0x100
> [ 565.069829] dev_xdp_attach+0x205/0x5d0
> [ 565.069841] do_setlink+0x7d3/0xc20
> [ 565.069853] ? dequeue_skb+0x80/0x4f0
> [ 565.069866] ? __nla_validate_parse+0x125/0x1d0
> [ 565.069880] __rtnl_newlink+0x4f7/0x630
> [ 565.069892] ? __kmalloc_cache_noprof+0x225/0x2b0
> [ 565.069905] rtnl_newlink+0x44/0x70
> [ 565.069915] rtnetlink_rcv_msg+0x15c/0x410
> [ 565.069928] ? avc_has_perm_noaudit+0x67/0xf0
> [ 565.069943] ? __pfx_rtnetlink_rcv_msg+0x10/0x10
> [ 565.069956] netlink_rcv_skb+0x57/0x100
> [ 565.069969] netlink_unicast+0x246/0x370
> [ 565.069980] netlink_sendmsg+0x1f6/0x430
> [ 565.069991] ____sys_sendmsg+0x3be/0x3f0
> [ 565.070003] ? import_iovec+0x16/0x20
> [ 565.070015] ? copy_msghdr_from_user+0x6d/0xa0
> [ 565.070028] ___sys_sendmsg+0x88/0xd0
> [ 565.070038] ? __memcg_slab_free_hook+0xd5/0x120
> [ 565.070050] ? __inode_wait_for_writeback+0x7d/0xf0
> [ 565.070065] ? mod_objcg_state+0xc9/0x2f0
> [ 565.070076] __sys_sendmsg+0x59/0xa0
> [ 565.070086] ? syscall_trace_enter+0xfb/0x190
> [ 565.070098] do_syscall_64+0x60/0x180
> [ 565.070111] entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [ 565.070126] RIP: 0033:0x7f677ab0f94d
> [ 565.070136] Code: 28 89 54 24 1c 48 89 74 24 10 89 7c 24 08 e8 0a 67
> f7 ff 8b 54 24 1c 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 2e 00 00 00 0f
> 05 <48> 3d 00 f0 ff ff 77 33 44 89 c7 48 89 44 24 08 e8 5e 67 f7 ff 48
> [ 565.070164] RSP: 002b:00007ffd1e4f7a60 EFLAGS: 00000293 ORIG_RAX:
> 000000000000002e
> [ 565.070178] RAX: ffffffffffffffda RBX: 0000000000000000 RCX:
> 00007f677ab0f94d
> [ 565.070191] RDX: 0000000000000000 RSI: 000000001d698848 RDI:
> 000000000000000a
> [ 565.070203] RBP: 000000001d5350e0 R08: 0000000000000000 R09:
> 0000000000465f98
> [ 565.070215] R10: 0000000000200000 R11: 0000000000000293 R12:
> 000000001d535110
> [ 565.070227] R13: 000000000051d798 R14: 000000001d698830 R15:
> 000000001d5384b0
> [ 565.070240] </TASK>
> [ 565.070248] Modules linked in: bonding(E) tls(E) nft_fib_inet(E)
> nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E)
> nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E)
> nft_chain_nat(E) nf_nat(E) nf_conntrack(E)
> nf_defrag_ipv6(E) nf_defrag_ipv4(E) rfkill(E) ip_set(E) nf_tables(E)
> libcrc32c(E) nfnetlink(E) vfat(E) fat(E) intel_rapl_msr(E)
> intel_rapl_common(E) sb_edac(E) x86_pkg_temp_thermal(E)
> intel_powerclamp(E) coretemp(E) kvm_intel(E)
> ipmi_ssif(E) kvm(E) iTCO_wdt(E) intel_pmc_bxt(E) iTCO_vendor_support(E) rapl(E)
> intel_cstate(E) intel_uncore(E) ast(E) i2c_i801(E) pcspkr(E) mei_me(E)
> drm_shmem_helper(E) mxm_wmi(E) drm_kms_helper(E) i2c_mux(E) mei(E)
> i2c_smbus(E) lpc_ich(E)
> ioatdma(E) acpi_power_meter(E) ipmi_si(E) acpi_ipmi(E) joydev(E)
> ipmi_devintf(E) ipmi_msghandler(E) acpi_pad(E) drm(E) fuse(E) ext4(E)
> mbcache(E) jbd2(E) sd_mod(E) sg(E) ice(E) ahci(E) crct10dif_pclmul(E)
> crc32_pclmul(E) crc32c_intel(E)
> libahci(E) polyval_clmulni(E) igb(E) polyval_generic(E) libata(E)
> ghash_clmulni_intel(E)
> [ 565.070304] i2c_algo_bit(E) dca(E) libie(E) wmi(E) dm_mirror(E)
> dm_region_hash(E) dm_log(E) dm_mod(E)
> [ 565.071430] CR2: ffffa566707380c4
>