On 19.8.2022 15.04, Hans de Goede wrote:
Hi All, While dogfooding 6.0-rc1 on my main workstation I noticed the following new WARN/oops: Aug 19 12:22:39 shalem kernel: ------------[ cut here ]------------ Aug 19 12:22:39 shalem kernel: DMA-API: xhci_hcd 0000:30:00.3: cacheline tracking EEXIST, overlapping mappings aren't supported Aug 19 12:22:39 shalem kernel: WARNING: CPU: 14 PID: 8321 at kernel/dma/debug.c:570 add_dma_entry+0x1ed/0x270 Aug 19 12:22:39 shalem kernel: Modules linked in: rfcomm snd_seq_dummy snd_hrtimer qrtr bnep vboxnetadp(OE) vboxnetflt(OE) vboxdrv(OE) binfmt_misc vfat fat intel_rapl_msr intel_rapl_common iwlmvm snd_hda_codec_realtek snd_hda_codec_generic edac_mce_amd mac80211 ledtrig_audio snd_hda_codec_hdmi kvm_amd snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi libarc4 kvm snd_hda_codec irqbypass iwlwifi snd_hda_core snd_usb_audio btusb rapl btrtl snd_usbmidi_lib iwlmei btbcm snd_hwdep btintel snd_rawmidi snd_seq mc btmtk snd_seq_device cfg80211 wmi_bmof bluetooth snd_pcm joydev snd_timer rfkill mei r8169 snd soundcore i2c_piix4 k10temp gpio_amdpt gpio_generic zram dm_crypt amdgpu drm_ttm_helper ttm iommu_v2 crct10dif_pclmul gpu_sched crc32_pclmul crc32c_intel drm_buddy hid_lg_g15 drm_display_helper nvme ghash_clmulni_intel nvme_core ccp cec sp5100_tco wmi video hid_logitech_hidpp uas usb_storage hid_logitech_dj ip6_tables ip_tables fuse i2c_dev Aug 19 12:22:39 shalem kernel: Unloaded tainted modules: acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 amd64_edac():1 acpi_cpufreq():1 amd64_edac():1 acpi_cpufreq():1 amd64_edac():1 acpi_cpufreq():1 amd64_edac():1 acpi_cpufreq():1 amd64_edac():1 amd64_edac():1 acpi_cpufreq():1 fjes():1 amd64_edac():1 acpi_cpufreq():1 fjes():1 amd64_edac():1 acpi_cpufreq():1 fjes():1 pcc_cpufreq():1 amd64_edac():1 acpi_cpufreq():1 fjes():1 amd64_edac():1 acpi_cpufreq():1 pcc_cpufreq():1 fjes():1 acpi_cpufreq():1 amd64_edac():1 pcc_cpufreq():1 acpi_cpufreq():1 fjes():1 amd64_edac():1 pcc_cpufreq():1 acpi_cpufreq():1 fjes():1 amd64_edac():1 pcc_cpufreq():1 fjes():1 amd64_edac():1 acpi_cpufreq():1 pcc_cpufreq():1 fjes():1 amd64_edac():1 acpi_cpufreq():1 pcc_cpufreq():1 amd64_edac():1 fjes():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 fjes():1 acpi_cpufreq():1 pcc_cpufreq():1 Aug 19 12:22:39 shalem kernel: fjes():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 fjes():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 fjes():1 pcc_cpufreq():1 acpi_cpufreq():1 fjes():1 acpi_cpufreq():1 pcc_cpufreq():1 fjes():1 acpi_cpufreq():1 Aug 19 12:22:39 shalem kernel: CPU: 14 PID: 8321 Comm: kworker/u64:57 Tainted: G OE ------- --- 6.0.0-0.rc1.20220817git3cc40a443a04.14.fc38.x86_64 #1 Aug 19 12:22:39 shalem kernel: Hardware name: Micro-Star International Co., Ltd. MS-7C95/B550M PRO-VDH WIFI (MS-7C95), BIOS 2.90 12/23/2021 Aug 19 12:22:39 shalem kernel: Workqueue: events_unbound async_run_entry_fn Aug 19 12:22:39 shalem kernel: RIP: 0010:add_dma_entry+0x1ed/0x270 Aug 19 12:22:39 shalem kernel: Code: ff 0f 84 97 00 00 00 48 8b 5f 50 48 85 db 75 03 48 8b 1f e8 15 96 85 00 48 89 c6 48 89 da 48 c7 c7 30 c9 86 85 e8 73 99 ce 00 <0f> 0b 48 85 ed 0f 85 a9 50 cf 00 8b 05 f2 09 23 02 85 c0 0f 85 ca Aug 19 12:22:39 shalem kernel: RSP: 0018:ffffa78dc2c0ba20 EFLAGS: 00010282 Aug 19 12:22:39 shalem kernel: RAX: 0000000000000060 RBX: ffff8ac186a636c0 RCX: 0000000000000000 Aug 19 12:22:39 shalem kernel: RDX: 0000000000000001 RSI: ffffffff858d07a4 RDI: 00000000ffffffff Aug 19 12:22:39 shalem kernel: RBP: ffff8ac184f17a80 R08: 0000000000000001 R09: 0000000000000001 Aug 19 12:22:39 shalem kernel: R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000 Aug 19 12:22:39 shalem kernel: R13: 0000000000000001 R14: 0000000000000206 R15: 00000000042f65bd Aug 19 12:22:39 shalem kernel: FS: 0000000000000000(0000) GS:ffff8ac409a00000(0000) knlGS:0000000000000000 Aug 19 12:22:39 shalem kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Aug 19 12:22:39 shalem kernel: CR2: 0000000000000000 CR3: 000000014c028000 CR4: 0000000000750ee0 Aug 19 12:22:39 shalem kernel: PKRU: 55555554 Aug 19 12:22:39 shalem kernel: Call Trace: Aug 19 12:22:39 shalem kernel: <TASK> Aug 19 12:22:39 shalem kernel: dma_map_page_attrs+0x82/0x2d0 Aug 19 12:22:39 shalem kernel: ? _raw_spin_unlock_irqrestore+0x30/0x60 Aug 19 12:22:39 shalem kernel: usb_hcd_map_urb_for_dma+0x423/0x500 Aug 19 12:22:39 shalem kernel: usb_hcd_submit_urb+0x9b/0xb80 Aug 19 12:22:39 shalem kernel: ? lockdep_init_map_type+0x62/0x260 Aug 19 12:22:39 shalem kernel: usb_start_wait_urb+0x55/0x140 Aug 19 12:22:39 shalem kernel: usb_control_msg+0xc9/0x120 Aug 19 12:22:39 shalem kernel: hub_ext_port_status+0x89/0x110 Aug 19 12:22:39 shalem kernel: hub_activate+0x11c/0x8b0 Aug 19 12:22:39 shalem kernel: ? usb_dev_thaw+0x10/0x10 Aug 19 12:22:39 shalem kernel: hub_resume+0x23/0xe0 Aug 19 12:22:39 shalem kernel: ? usb_dev_thaw+0x10/0x10 Aug 19 12:22:39 shalem kernel: usb_resume_interface.constprop.0.isra.0+0x89/0xd0 Aug 19 12:22:39 shalem kernel: usb_resume_both+0x109/0x180 Aug 19 12:22:39 shalem kernel: ? usb_dev_thaw+0x10/0x10 Aug 19 12:22:39 shalem kernel: usb_resume+0x15/0x60 Aug 19 12:22:39 shalem kernel: dpm_run_callback+0x53/0xc0 Aug 19 12:22:39 shalem kernel: device_resume+0xa8/0x200 Aug 19 12:22:39 shalem kernel: async_resume+0x19/0x30 Aug 19 12:22:39 shalem kernel: async_run_entry_fn+0x30/0x130 Aug 19 12:22:39 shalem kernel: process_one_work+0x2a0/0x600 Aug 19 12:22:39 shalem kernel: worker_thread+0x4f/0x3a0 Aug 19 12:22:39 shalem kernel: ? process_one_work+0x600/0x600 Aug 19 12:22:39 shalem kernel: kthread+0xf5/0x120 Aug 19 12:22:39 shalem kernel: ? kthread_complete_and_exit+0x20/0x20 Aug 19 12:22:39 shalem kernel: ret_from_fork+0x22/0x30 Aug 19 12:22:39 shalem kernel: </TASK> Aug 19 12:22:39 shalem kernel: irq event stamp: 2115 Aug 19 12:22:39 shalem kernel: hardirqs last enabled at (2123): [<ffffffff8418fb52>] vprintk_emit+0x342/0x350 Aug 19 12:22:39 shalem kernel: hardirqs last disabled at (2130): [<ffffffff8418fb01>] vprintk_emit+0x2f1/0x350 Aug 19 12:22:39 shalem kernel: softirqs last enabled at (0): [<ffffffff840f5079>] copy_process+0xa49/0x1e50 Aug 19 12:22:39 shalem kernel: softirqs last disabled at (0): [<0000000000000000>] 0x0 Aug 19 12:22:39 shalem kernel: ---[ end trace 0000000000000000 ]--- Aug 19 12:22:39 shalem kernel: DMA-API: Mapped at: Aug 19 12:22:39 shalem kernel: debug_dma_map_page+0x66/0x100 Aug 19 12:22:39 shalem kernel: dma_map_page_attrs+0x82/0x2d0 Aug 19 12:22:39 shalem kernel: usb_hcd_map_urb_for_dma+0x423/0x500 Aug 19 12:22:39 shalem kernel: usb_hcd_submit_urb+0x9b/0xb80 Aug 19 12:22:39 shalem kernel: usb_start_wait_urb+0x55/0x140
I'm guessing that the data buffer where hub driver stores port status data was already dma mapped when hub driver submitted a new get port status URB. usb core tries to map the buffer again at urb subit, triggering this. Hub driver allocates one data buffer once, and uses it to store both hub status data and port status data. (union &hub->status->port and &hub->status->hub) So probably the previous URB (get port status, or get hub status) never successfully completed, or for some reason didn't unmap dma at completion. Looks like this issue is hit in resume, so maybe somethin went wrong in completing the hub URB in suspend? Was this easy to reproduce? any external hubs connected? One thing to check would be if usbmon shows an uncompleted hub URB when a new URB is submitted, triggering this. Thanks -Mathias