[ptdma] pt_core_execute_cmd() from interrupt context results in panic

I am wondering whether this is a known issue in the ptdma DMA driver.
I did not see anything obvious in Bugzilla.

I am doing some testing of the ntb_netdev module in conjunction with
the ptdma module as the supporting DMA engines on an AMD Rome CPU
based platform. The ptdma driver being used is the latest code in the
Linux (6.2) repository.

There are no issues in doing simple ping operations across the
ntb_netdev (TCP/IP) interface, including sending large packets which
we know will cause the respective DMA engines to be utilized. However,
while doing iperf testing across the ntb_netdev interface, we have
encountered a panic:

[ 1626.776583] RIP: 0010:mutex_spin_on_owner+0x3b/0xa0
....
[ 1626.776588] Call Trace:
[ 1626.776588]  <IRQ>
[ 1626.776589]  __mutex_lock.isra.7+0xad/0x4c0
[ 1626.776589]  ? ntb_transport_rx_enqueue+0x127/0x200 [ntb_transport]
[ 1626.776589]  __mutex_lock_slowpath+0x13/0x20
[ 1626.776590]  ? __mutex_lock_slowpath+0x13/0x20
[ 1626.776590]  mutex_lock+0x2f/0x40
[ 1626.776590]  pt_core_perform_passthru+0xc5/0x160 [ptdma]
[ 1626.776591]  pt_cmd_callback.part.7+0x262/0x2d0 [ptdma]
[ 1626.776591]  pt_cmd_callback+0x13/0x20 [ptdma]
[ 1626.776591]  pt_check_status_trans+0xc3/0x120 [ptdma]
[ 1626.776592]  pt_core_irq_handler+0x36/0x60 [ptdma]
[ 1626.776592]  __handle_irq_event_percpu+0x44/0x1a0
[ 1626.776592]  handle_irq_event_percpu+0x32/0x80
[ 1626.776593]  handle_irq_event+0x3b/0x60
[ 1626.776593]  handle_edge_irq+0x83/0x1a0
[ 1626.776593]  handle_irq+0x20/0x30
[ 1626.776593]  do_IRQ+0x50/0xe0
[ 1626.776594]  common_interrupt+0xf/0xf

The issue is that the ptdma handlers are called in interrupt context
(note the <IRQ> marker in the trace), and the flow ultimately reaches
pt_core_execute_cmd(), which attempts to take a mutex. A mutex can
sleep when contended, so it must not be taken in interrupt context. I
have temporarily changed the lock in question to a spinlock, which
seems to have resolved the issue. However, I don't know enough about
the ptdma driver to know whether this is the desired fix.
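
To make the distinction concrete, here is a minimal userspace model of
the two locking choices. It is only a sketch and not the ptdma code:
every name in it (model_in_irq, struct model_cmd_q, the model_submit_*
helpers) is hypothetical, the "interrupt context" is just a flag, and
a C11 atomic_flag stands in for a kernel spinlock. In the kernel, the
non-sleeping variant would normally be written with spin_lock_irqsave()
and spin_unlock_irqrestore().

```c
/* Userspace model only; all names here are hypothetical, not ptdma code. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for "we are currently running in hard-IRQ context". */
static bool model_in_irq;

/* Hypothetical command queue: a ring whose tail index needs protection. */
struct model_cmd_q {
    atomic_flag busy;      /* stands in for a kernel spinlock */
    unsigned int tail;     /* next free slot */
    unsigned int depth;    /* number of slots in the ring */
};

static void model_q_init(struct model_cmd_q *q, unsigned int depth)
{
    atomic_flag_clear(&q->busy);
    q->tail = 0;
    q->depth = depth;
}

/* Claim the next slot and advance the tail; caller must hold the lock. */
static int model_claim_slot(struct model_cmd_q *q)
{
    int slot = (int)q->tail;

    q->tail = (q->tail + 1) % q->depth;
    return slot;
}

/*
 * Sleeping-lock variant: a lock that may sleep is not allowed in
 * interrupt context, so the model rejects the call with -1 there.
 */
static int model_submit_sleeping(struct model_cmd_q *q)
{
    int slot;

    if (model_in_irq)
        return -1;

    while (atomic_flag_test_and_set_explicit(&q->busy, memory_order_acquire))
        ; /* a real mutex would put the caller to sleep here */
    slot = model_claim_slot(q);
    atomic_flag_clear_explicit(&q->busy, memory_order_release);
    return slot;
}

/*
 * Spinning variant: busy-waits and never sleeps, so the model allows
 * it in both process context and the modelled interrupt context.
 */
static int model_submit_spinning(struct model_cmd_q *q)
{
    int slot;

    while (atomic_flag_test_and_set_explicit(&q->busy, memory_order_acquire))
        ; /* spin until the lock is free */
    slot = model_claim_slot(q);
    atomic_flag_clear_explicit(&q->busy, memory_order_release);
    return slot;
}
```

With model_in_irq set, model_submit_sleeping() is refused while
model_submit_spinning() still hands out slots. Whether a plain spinlock
is the right fix in the real driver depends on which other paths take
the same lock, and that is the part I cannot judge.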

Hoping that others with more knowledge in this driver might be able to
comment as to the validity of this bug and whether a spinlock is the
correct approach here. If it is, I would be happy to submit a patch,
otherwise I can just file a bugzilla for the module owner to make a
more appropriate fix.

Thanks for any advice.

Eric Pilmore


