Re: 6.0-rc1 new XHCI cacheline tracking EEXIST WARN/oops

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 19.8.2022 15.04, Hans de Goede wrote:
Hi All,

While dogfooding 6.0-rc1 on my main workstation I noticed the following
new WARN/oops:

Aug 19 12:22:39 shalem kernel: ------------[ cut here ]------------
Aug 19 12:22:39 shalem kernel: DMA-API: xhci_hcd 0000:30:00.3: cacheline tracking EEXIST, overlapping mappings aren't supported
Aug 19 12:22:39 shalem kernel: WARNING: CPU: 14 PID: 8321 at kernel/dma/debug.c:570 add_dma_entry+0x1ed/0x270
Aug 19 12:22:39 shalem kernel: Modules linked in: rfcomm snd_seq_dummy snd_hrtimer qrtr bnep vboxnetadp(OE) vboxnetflt(OE) vboxdrv(OE) binfmt_misc vfat fat intel_rapl_msr intel_rapl_common iwlmvm snd_hda_codec_realtek snd_hda_codec_generic edac_mce_amd mac80211 ledtrig_audio snd_hda_codec_hdmi kvm_amd snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi libarc4 kvm snd_hda_codec irqbypass iwlwifi snd_hda_core snd_usb_audio btusb rapl btrtl snd_usbmidi_lib iwlmei btbcm snd_hwdep btintel snd_rawmidi snd_seq mc btmtk snd_seq_device cfg80211 wmi_bmof bluetooth snd_pcm joydev snd_timer rfkill mei r8169 snd soundcore i2c_piix4 k10temp gpio_amdpt gpio_generic zram dm_crypt amdgpu drm_ttm_helper ttm iommu_v2 crct10dif_pclmul gpu_sched crc32_pclmul crc32c_intel drm_buddy hid_lg_g15 drm_display_helper nvme ghash_clmulni_intel nvme_core ccp cec sp5100_tco wmi video hid_logitech_hidpp uas usb_storage hid_logitech_dj ip6_tables ip_tables fuse i2c_dev
Aug 19 12:22:39 shalem kernel: Unloaded tainted modules: acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 amd64_edac():1 acpi_cpufreq():1 amd64_edac():1 acpi_cpufreq():1 amd64_edac():1 acpi_cpufreq():1 amd64_edac():1 acpi_cpufreq():1 amd64_edac():1 amd64_edac():1 acpi_cpufreq():1 fjes():1 amd64_edac():1 acpi_cpufreq():1 fjes():1 amd64_edac():1 acpi_cpufreq():1 fjes():1 pcc_cpufreq():1 amd64_edac():1 acpi_cpufreq():1 fjes():1 amd64_edac():1 acpi_cpufreq():1 pcc_cpufreq():1 fjes():1 acpi_cpufreq():1 amd64_edac():1 pcc_cpufreq():1 acpi_cpufreq():1 fjes():1 amd64_edac():1 pcc_cpufreq():1 acpi_cpufreq():1 fjes():1 amd64_edac():1 pcc_cpufreq():1 fjes():1 amd64_edac():1 acpi_cpufreq():1 pcc_cpufreq():1 fjes():1 amd64_edac():1 acpi_cpufreq():1 pcc_cpufreq():1 amd64_edac():1 fjes():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 fjes():1 acpi_cpufreq():1 pcc_cpufreq():1
Aug 19 12:22:39 shalem kernel:  fjes():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 fjes():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 fjes():1 pcc_cpufreq():1 acpi_cpufreq():1 fjes():1 acpi_cpufreq():1 pcc_cpufreq():1 fjes():1 acpi_cpufreq():1
Aug 19 12:22:39 shalem kernel: CPU: 14 PID: 8321 Comm: kworker/u64:57 Tainted: G           OE     -------  ---  6.0.0-0.rc1.20220817git3cc40a443a04.14.fc38.x86_64 #1
Aug 19 12:22:39 shalem kernel: Hardware name: Micro-Star International Co., Ltd. MS-7C95/B550M PRO-VDH WIFI (MS-7C95), BIOS 2.90 12/23/2021
Aug 19 12:22:39 shalem kernel: Workqueue: events_unbound async_run_entry_fn
Aug 19 12:22:39 shalem kernel: RIP: 0010:add_dma_entry+0x1ed/0x270
Aug 19 12:22:39 shalem kernel: Code: ff 0f 84 97 00 00 00 48 8b 5f 50 48 85 db 75 03 48 8b 1f e8 15 96 85 00 48 89 c6 48 89 da 48 c7 c7 30 c9 86 85 e8 73 99 ce 00 <0f> 0b 48 85 ed 0f 85 a9 50 cf 00 8b 05 f2 09 23 02 85 c0 0f 85 ca
Aug 19 12:22:39 shalem kernel: RSP: 0018:ffffa78dc2c0ba20 EFLAGS: 00010282
Aug 19 12:22:39 shalem kernel: RAX: 0000000000000060 RBX: ffff8ac186a636c0 RCX: 0000000000000000
Aug 19 12:22:39 shalem kernel: RDX: 0000000000000001 RSI: ffffffff858d07a4 RDI: 00000000ffffffff
Aug 19 12:22:39 shalem kernel: RBP: ffff8ac184f17a80 R08: 0000000000000001 R09: 0000000000000001
Aug 19 12:22:39 shalem kernel: R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
Aug 19 12:22:39 shalem kernel: R13: 0000000000000001 R14: 0000000000000206 R15: 00000000042f65bd
Aug 19 12:22:39 shalem kernel: FS:  0000000000000000(0000) GS:ffff8ac409a00000(0000) knlGS:0000000000000000
Aug 19 12:22:39 shalem kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 19 12:22:39 shalem kernel: CR2: 0000000000000000 CR3: 000000014c028000 CR4: 0000000000750ee0
Aug 19 12:22:39 shalem kernel: PKRU: 55555554
Aug 19 12:22:39 shalem kernel: Call Trace:
Aug 19 12:22:39 shalem kernel:  <TASK>
Aug 19 12:22:39 shalem kernel:  dma_map_page_attrs+0x82/0x2d0
Aug 19 12:22:39 shalem kernel:  ? _raw_spin_unlock_irqrestore+0x30/0x60
Aug 19 12:22:39 shalem kernel:  usb_hcd_map_urb_for_dma+0x423/0x500
Aug 19 12:22:39 shalem kernel:  usb_hcd_submit_urb+0x9b/0xb80
Aug 19 12:22:39 shalem kernel:  ? lockdep_init_map_type+0x62/0x260
Aug 19 12:22:39 shalem kernel:  usb_start_wait_urb+0x55/0x140
Aug 19 12:22:39 shalem kernel:  usb_control_msg+0xc9/0x120
Aug 19 12:22:39 shalem kernel:  hub_ext_port_status+0x89/0x110
Aug 19 12:22:39 shalem kernel:  hub_activate+0x11c/0x8b0
Aug 19 12:22:39 shalem kernel:  ? usb_dev_thaw+0x10/0x10
Aug 19 12:22:39 shalem kernel:  hub_resume+0x23/0xe0
Aug 19 12:22:39 shalem kernel:  ? usb_dev_thaw+0x10/0x10
Aug 19 12:22:39 shalem kernel:  usb_resume_interface.constprop.0.isra.0+0x89/0xd0
Aug 19 12:22:39 shalem kernel:  usb_resume_both+0x109/0x180
Aug 19 12:22:39 shalem kernel:  ? usb_dev_thaw+0x10/0x10
Aug 19 12:22:39 shalem kernel:  usb_resume+0x15/0x60
Aug 19 12:22:39 shalem kernel:  dpm_run_callback+0x53/0xc0
Aug 19 12:22:39 shalem kernel:  device_resume+0xa8/0x200
Aug 19 12:22:39 shalem kernel:  async_resume+0x19/0x30
Aug 19 12:22:39 shalem kernel:  async_run_entry_fn+0x30/0x130
Aug 19 12:22:39 shalem kernel:  process_one_work+0x2a0/0x600
Aug 19 12:22:39 shalem kernel:  worker_thread+0x4f/0x3a0
Aug 19 12:22:39 shalem kernel:  ? process_one_work+0x600/0x600
Aug 19 12:22:39 shalem kernel:  kthread+0xf5/0x120
Aug 19 12:22:39 shalem kernel:  ? kthread_complete_and_exit+0x20/0x20
Aug 19 12:22:39 shalem kernel:  ret_from_fork+0x22/0x30
Aug 19 12:22:39 shalem kernel:  </TASK>
Aug 19 12:22:39 shalem kernel: irq event stamp: 2115
Aug 19 12:22:39 shalem kernel: hardirqs last  enabled at (2123): [<ffffffff8418fb52>] vprintk_emit+0x342/0x350
Aug 19 12:22:39 shalem kernel: hardirqs last disabled at (2130): [<ffffffff8418fb01>] vprintk_emit+0x2f1/0x350
Aug 19 12:22:39 shalem kernel: softirqs last  enabled at (0): [<ffffffff840f5079>] copy_process+0xa49/0x1e50
Aug 19 12:22:39 shalem kernel: softirqs last disabled at (0): [<0000000000000000>] 0x0
Aug 19 12:22:39 shalem kernel: ---[ end trace 0000000000000000 ]---
Aug 19 12:22:39 shalem kernel: DMA-API: Mapped at:
Aug 19 12:22:39 shalem kernel:  debug_dma_map_page+0x66/0x100
Aug 19 12:22:39 shalem kernel:  dma_map_page_attrs+0x82/0x2d0
Aug 19 12:22:39 shalem kernel:  usb_hcd_map_urb_for_dma+0x423/0x500
Aug 19 12:22:39 shalem kernel:  usb_hcd_submit_urb+0x9b/0xb80
Aug 19 12:22:39 shalem kernel:  usb_start_wait_urb+0x55/0x140


I'm guessing that the data buffer where hub driver stores port status data
was already dma mapped when hub driver submitted a new get port status URB.
usb core tries to map the buffer again at urb subit, triggering this.

Hub driver allocates one data buffer once, and uses it to store both
hub status data and port status data. (union &hub->status->port and &hub->status->hub)

So probably the previous URB (get port status, or get hub status) never successfully
completed, or for some reason didn't unmap dma at completion.

Looks like this issue is hit in resume, so maybe somethin went wrong in completing the
hub URB in suspend?

Was this easy to reproduce? any external hubs connected?

One thing to check would be if usbmon shows an uncompleted hub URB when
a new URB is submitted, triggering this.

Thanks
-Mathias




[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux