On Thursday, December 12, 2024 10:09 PM, Fijalkowski, Maciej wrote: >On Wed, Dec 11, 2024 at 09:45:32PM +0800, Song Yoong Siang wrote: >> Improve XDP_SETUP_PROG process by avoiding unnecessary link down event. >> [...] > >I agree we could do better than igc_close/igc_open pair, but have you >tried igc_down/igc_open instead? > Hi Fijalkowski Maciej, Thanks for your comment. I did 2 experiments on your suggestion: 1. with igc_down/igc_open, kernel dump observed after load xdp prog, as shown in dmesg below: [ 87.957526] ------------[ cut here ]------------ [ 87.962168] remove_proc_entry: removing non-empty directory 'irq/150', leaking at least 'enp2s0' [ 87.971014] WARNING: CPU: 17 PID: 3690 at fs/proc/generic.c:717 remove_proc_entry+0x19a/0x1b0 [ 87.979581] Modules linked in: algif_hash(E) af_alg(E) nf_conntrack_netlink(E) xfrm_user(E) xfrm_algo(E) xt_addrtype(E) br_netfilter(E) nft_masq(E) xt_CHECKSUM(E) xt_MASQUERADE(E) xt_conntrack(E) ipt_REJECT(E) nf_reject_ipv4(E) xt_tcpudp(E) nft_compat(E) nft_chain_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) nf_tables(E) nfnetlink(E) bridge(E) stp(E) llc(E) overlay(E) binfmt_misc(E) intel_uncore_frequency(E) intel_uncore_frequency_common(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) kvm_intel(E) processor_thermal_device_pci(E) kvm(E) processor_thermal_device(E) nls_iso8859_1(E) mei_hdcp(E) cmdlinepart(E) intel_rapl_msr(E) processor_thermal_wt_hint(E) hid_sensor_als(E) processor_thermal_rfim(E) rapl(E) spi_nor(E) processor_thermal_rapl(E) hid_sensor_trigger(E) intel_cstate(E) wmi_bmof(E) mtd(E) industrialio_triggered_buffer(E) intel_rapl_common(E) kfifo_buf(E) mei_me(E) hid_sensor_iio_common(E) processor_thermal_wt_req(E) processor_thermal_power_floor(E) idma64(E) [ 87.979631] processor_thermal_mbox(E) mei(E) industrialio(E) int340x_thermal_zone(E) virt_dma(E) intel_pmc_core(E) int3400_thermal(E) pmt_telemetry(E) intel_hid(E) acpi_thermal_rel(E) mac_hid(E) pmt_class(E) sparse_keymap(E) acpi_pad(E) acpi_tad(E) sch_fq_codel(E) msr(E) parport_pc(E) ppdev(E) lp(E) parport(E) ramoops(E) reed_solomon(E) pstore_blk(E) pstore_zone(E) efi_pstore(E) ip_tables(E) x_tables(E) autofs4(E) btrfs(E) blake2b_generic(E) xor(E) raid6_pq(E) libcrc32c(E) dm_mirror(E) dm_region_hash(E) dm_log(E) hid_sensor_custom(E) hid_sensor_hub(E) hid_generic(E) intel_ishtp_hid(E) hid(E) i915(E) prime_numbers(E) drm_buddy(E) ttm(E) i2c_algo_bit(E) mxl_gpy(E) polynomial(E) drm_display_helper(E) crct10dif_pclmul(E) cec(E) dwc3(E) dwmac_intel(E) crc32_pclmul(E) stmmac(E) ghash_clmulni_intel(E) rc_core(E) ulpi(E) spi_pxa2xx_platform(E) udc_core(E) sha512_ssse3(E) i2c_i801(E) dw_dmac(E) sha256_ssse3(E) pcs_xpcs(E) i2c_mux(E) nvme(E) dw_dmac_core(E) drm_kms_helper(E) sha1_ssse3(E) e1000e(E) phylink(E) i2c_smbus(E) [ 88.068050] spi_pxa2xx_core(E) igc(E) nvme_core(E) ahci(E) libahci(E) intel_ish_ipc(E) drm(E) dwc3_pci(E) intel_ishtp(E) video(E) pinctrl_alderlake(E) wmi(E) pinctrl_intel(E) ax88179_178a(E) usbnet(E) mii(E) uas(E) usb_storage(E) aesni_intel(E) crypto_simd(E) cryptd(E) [ 88.181533] CPU: 17 UID: 0 PID: 3690 Comm: ip Tainted: G U E 6.13.0-rc1-siang+ #36 [ 88.190327] Tainted: [U]=USER, [E]=UNSIGNED_MODULE [ 88.195136] Hardware name: Intel Corporation Alder Lake Client Platform/AlderLake-S ADP-S DDR5 UDIMM CRB, BIOS ADLSFWI1.R00.3032.B00.2201111106 01/11/2022 [ 88.208957] RIP: 0010:remove_proc_entry+0x19a/0x1b0 [ 88.213855] Code: 80 d8 5c b5 48 85 c0 48 8d 90 78 ff ff ff 48 0f 45 c2 49 8b 54 24 78 4c 8b 80 a0 00 00 00 48 8b 92 a0 00 00 00 e8 a6 c9 bb ff <0f> 0b e9 40 ff ff ff e8 aa 4b a8 00 66 2e 0f 1f 84 00 00 00 00 00 [ 88.232613] RSP: 0018:ffffba1a83c1f3a8 EFLAGS: 00010282 [ 88.237851] RAX: 0000000000000000 RBX: ffff8dc340294e40 RCX: 0000000000000000 [ 88.245003] RDX: 0000000000000002 RSI: ffffffffb55a0e36 RDI: 00000000ffffffff [ 88.252153] RBP: ffffba1a83c1f3d8 R08: 0000000000000000 R09: ffffba1a83c1f210 [ 88.259302] R10: ffffba1a83c1f208 R11: ffffffffb5f50ec8 R12: ffff8dc3c2606d80 [ 88.266470] R13: ffff8dc3c2606e00 R14: 0000000000000000 R15: 0000000000000097 [ 88.273626] FS: 00007f7018866b80(0000) GS:ffff8dcacf880000(0000) knlGS:0000000000000000 [ 88.281734] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 88.287505] CR2: 00005603b5622d88 CR3: 0000000195582000 CR4: 0000000000752ef0 [ 88.294663] PKRU: 55555554 [ 88.297400] Call Trace: [ 88.299878] <TASK> [ 88.302009] ? show_regs+0x69/0x80 [ 88.305444] ? __warn+0x8d/0x140 [ 88.308702] ? remove_proc_entry+0x19a/0x1b0 [ 88.312999] ? report_bug+0x1c5/0x1d0 [ 88.316692] ? irq_work_queue+0x40/0x60 [ 88.320558] ? handle_bug+0x63/0xa0 [ 88.324075] ? exc_invalid_op+0x19/0x70 [ 88.327938] ? asm_exc_invalid_op+0x1b/0x20 [ 88.332154] ? remove_proc_entry+0x19a/0x1b0 [ 88.336449] ? remove_proc_entry+0x19a/0x1b0 [ 88.340745] unregister_irq_proc+0xe4/0x110 [ 88.344960] free_desc+0x41/0xd0 [ 88.348218] ? irq_domain_free_irqs+0x138/0x1b0 [ 88.352773] ? kfree+0x25b/0x2f0 [ 88.356031] irq_free_descs+0x4c/0x70 [ 88.359720] irq_domain_free_irqs+0x151/0x1b0 [ 88.364107] msi_domain_free_locked.part.0+0x186/0x1b0 [ 88.369270] msi_domain_free_irqs_all_locked+0x6d/0x90 [ 88.374433] pci_msi_teardown_msi_irqs+0x3f/0x50 [ 88.379078] pci_free_msi_irqs+0x12/0x40 [ 88.383032] pci_disable_msix+0x51/0x70 [ 88.386894] igc_reset_interrupt_capability+0x2c/0xc0 [igc] ptp4l[88.192]: p[ 88.392520] __igc_open+0x356/0x630 [igc] [ 88.397927] igc_open+0x31/0x60 [igc] ort 1 (enp2s0): [ 88.401635] igc_xdp_set_prog+0xa0/0xf0 [igc] [ 88.407389] igc_bpf+0x46/0x50 [igc] SLAVE to LISTENI[ 88.411012] dev_xdp_install+0x85/0x140 [ 88.416247] ? __pfx_igc_bpf+0x10/0x10 [igc] [ 88.420546] dev_xdp_attach+0x20f/0x5f0 [ 88.424411] dev_change_xdp_fd+0x1f3/0x230 [ 88.428534] do_setlink.constprop.0+0x695/0x11b0 [ 88.433180] ? _raw_spin_unlock+0x19/0x40 [ 88.437220] ? __nla_validate_parse+0x5a/0xe80 [ 88.441694] ? __handle_mm_fault+0xbb2/0x1060 [ 88.446076] ? ns_capable+0x2d/0x60 [ 88.449592] ? netlink_ns_capable+0x3c/0x50 [ 88.453807] rtnl_newlink+0x8bb/0xd10 [ 88.457497] ? count_memcg_events.constprop.0+0x1e/0x40 [ 88.462750] ? security_capable+0x69/0x190 [ 88.466879] ? __pfx_rtnl_newlink+0x10/0x10 [ 88.471091] rtnetlink_rcv_msg+0x35b/0x400 [ 88.475212] ? __pfx_rtnetlink_rcv_msg+0x10/0x10 [ 88.479856] netlink_rcv_skb+0x59/0x110 [ 88.483721] rtnetlink_rcv+0x15/0x20 [ 88.487325] netlink_unicast+0x23c/0x350 [ 88.491273] netlink_sendmsg+0x208/0x460 [ 88.495228] ____sys_sendmsg+0x3c1/0x3f0 [ 88.499183] ___sys_sendmsg+0x88/0xd0 [ 88.502872] ? debug_smp_processor_id+0x17/0x20 [ 88.507426] ? mod_objcg_state+0xb6/0x2f0 [ 88.511464] ? mod_objcg_state+0xb6/0x2f0 [ 88.515502] ? __memcg_slab_free_hook+0xdf/0x130 [ 88.520147] __sys_sendmsg+0x74/0xd0 [ 88.523754] __x64_sys_sendmsg+0x1d/0x30 [ 88.527702] x64_sys_call+0x92e/0x2140 [ 88.531482] do_syscall_64+0x4d/0x120 [ 88.535170] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 88.540247] RIP: 0033:0x7f7018727b17 [ 88.543853] Code: 0f 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10 [ 88.562625] RSP: 002b:00007ffd53158938 EFLAGS: 00000246 ORIG_RAX: 000000000000002e [ 88.570215] RAX: ffffffffffffffda RBX: 00000000675af93b RCX: 00007f7018727b17 [ 88.577369] RDX: 0000000000000000 RSI: 00007ffd531589a0 RDI: 0000000000000003 [ 88.584528] RBP: 0000000000000000 R08: 0000000000000001 R09: 00005603bd0ecd90 [ 88.591686] R10: 00005603bd0ec360 R11: 0000000000000246 R12: 0000000000000001 [ 88.598844] R13: 00007ffd53158a50 R14: 0000000000000000 R15: 00005603b55d8040 NG on ANNOUNCE_R[ 88.606005] </TASK> [ 88.609592] ---[ end trace 0000000000000000 ]--- 2. with igc_down/igc_up, there is still link down and lost sync event, as shown in ptp4l log below: ptp4l[183.164]: rms 8 max 15 freq +1012 +/- 4 delay 8 +/- 0 ptp4l[184.164]: rms 3 max 6 freq +1018 +/- 5 delay 8 +/- 0 ptp4l[185.164]: rms 4 max 6 freq +1026 +/- 3 delay 8 +/- 0 [ 185.998988] NET: Registered PF_ALG protocol family ptp4l[186.064]: port 1 (enp2s0): link down ptp4l[186.064]: port 1 (enp2s0): SLAVE to FAULTY on FAULT_DETECTED (FT_UNSPECIFIED) ptp4l[186.124]: selected local clock 22abbc.fffe.bb1234 as best master ptp4l[186.124]: port 1 (enp2s0): assuming the grand master role ptp4l[186.124]: port 1 (enp2s0): master state recommended in slave only mode ptp4l[186.124]: port 1 (enp2s0): defaultDS.priority1 probably misconfigured [ 188.375104] igc 0000:02:00.0 enp2s0: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX ptp4l[188.896]: port 1 (enp2s0): link up ptp4l[188.928]: port 1 (enp2s0): FAULTY to LISTENING on INIT_COMPLETE ptp4l[192.100]: port 1 (enp2s0): new foreign master 44abbc.fffe.bb2144-1 ptp4l[194.100]: selected best master clock 44abbc.fffe.bb2144 ptp4l[194.100]: port 1 (enp2s0): LISTENING to UNCALIBRATED on RS_SLAVE ptp4l[194.974]: port 1 (enp2s0): UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED ptp4l[195.099]: rms 172 max 375 freq +895 +/- 225 delay 10 +/- 2 ptp4l[196.099]: rms 126 max 245 freq +739 +/- 98 delay 13 +/- 0 ptp4l[197.099]: rms 80 max 97 freq +986 +/- 45 delay 13 +/- 0 Thanks & Regards Siang >> + } >> + } >> >> return 0; >> } >> -- >> 2.34.1 >>