On 9/13/23 10:47, Tim Blechmann wrote: > many thanks for your analysis, very interesting. > >> please find below an extract of your trace with my analysis: >> >> >> stm32mp1_bulk_p-390 [001] ..... 907.241226: rpmsg_send >> <-rpmsg_intercore_send_buffer.constprop.0 >> stm32mp1_bulk_p-390 [001] ..... 907.241228: virtio_rpmsg_send >> <-rpmsg_send >> stm32mp1_bulk_p-390 [001] ..... 907.241237: virtqueue_enable_cb >> <-rpmsg_send_offchannel_raw >> stm32mp1_bulk_p-390 [001] ..... 907.241239: virtqueue_enable_cb_prepare >> >> At this point seems that no more TX-buffer >> >> <-rpmsg_recv_single >> kworker/0:4-67 [000] ..... 907.242533: vring_interrupt >> <-rproc_vq_interrupt >> kworker/0:4-67 [000] ..... 907.242536: rpmsg_xmit_done >> >> Here you receive an interrupt indicating that TX buffer has been released by >> the remote. that's the expected behavior. >> >> >> <-rpmsg_send_offchannel_raw >> stm32mp1_bulk_p-390 [000] ..... 984.054941: rpmsg_send >> <-rpmsg_intercore_send_buffer.constprop.0 >> stm32mp1_bulk_p-390 [000] ..... 984.054943: virtio_rpmsg_send >> <-rpmsg_send >> stm32mp1_bulk_p-390 [000] ..... 984.054956: virtqueue_enable_cb >> <-rpmsg_send_offchannel_raw >> stm32mp1_bulk_p-390 [000] ..... 984.054958: virtqueue_enable_cb_prepare >> <-virtqueue_enable_cb >> stm32mp1_bulk_p-390 [000] ..... 999.398667: virtqueue_disable_cb >> <-rpmsg_send_offchannel_raw >> stm32mp1_bulk_p-390 [000] ..... 999.414840: rpmsg_send >> <-rpmsg_intercore_send_buffer.constprop.0 >> stm32mp1_bulk_p-390 [000] ..... 999.414843: virtio_rpmsg_send >> <-rpmsg_send >> stm32mp1_bulk_p-390 [000] ..... 999.414855: virtqueue_enable_cb >> <-rpmsg_send_offchannel_raw >> stm32mp1_bulk_p-390 [000] ..... 999.414857: virtqueue_enable_cb_prepare >> >> Here you have again no more TX buffer. From this point there is no more activity >> neither in TX nor in RX until the timeout of 15 seconds. >> If you have a look to rproc_vq_interrupt the last one occurs at 907.242533 >> >> >> As you have no more virtqueue interrupts call for both directions, the issue is >> probably either in the Cortex-M firmware, which seems to be stalled, or due to a >> disable of the IRQs in Linux. > > afaict we can rule out a complete stall of the cortex-m firmware: if we change > the rpmsg_send to a rpmsg_trysend/msleep loop, the trysend will succeed to get a > buffer after a few iterations. > >> or due to a disable of the IRQs in Linux. > > do you have some recommendations how we could trace this? First, check if the Cortex-M sends the IPCC mailbox notification during the 15-second period. Then, you can verify the IPCC registers to check if some flags are pending. On the Linux side, there is probably literature available to explain how to trace it. > > many thanks, > tim > >> <-virtqueue_enable_cb >> stm32mp1_bulk_p-390 [000] ..... 1014.758678: virtqueue_disable_cb >> <-rpmsg_send_offchannel_raw >> stm32mp1_bulk_p-390 [000] ..... 1014.774802: rpmsg_send >> <-rpmsg_intercore_send_buffer.constprop.0 >> stm32mp1_bulk_p-390 [000] ..... 1014.774804: virtio_rpmsg_send >> <-rpmsg_send >> stm32mp1_bulk_p-390 [000] ..... 1014.774815: virtqueue_enable_cb > >