I has been some time, but I have an update to share. On Sun, May 17, 2015 at 8:06 AM, Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote: > You might be able to learn more from ftrace. See the instructions in > Documentation/trace/ftrace.txt. The "irqsoff" tracer may be the best > one to try. I used ftrace with the irqoff tracer as recommended. irqoff seems to have some known issues on ARM and it traces the idle process, however, I was able to get some meaningful output when running this tracer for a very short time. I ran the function_graph tracer as well, which is quite informative, but it also adds a lot of overhead. On Mon, May 18, 2015 at 2:27 AM, Matthieu CASTET <matthieu.castet@xxxxxxxxxx> wrote: > last time I checked uvc driver, it was doing memcpy of video buffer : see uvc_video_decode_data. I found out Matthieu was on the right track and most of the time is indeed spent in uvc_video_decode_data(), which uses memcpy() to copy the URBs to the video buffers. For 1280x960px 15fps single 8-bit channel, uvc_video_decode_data() seems to consume between 40% and 50%. What is still a mystery to me is why I see relatively high CPU usage even for a very tiny resolution. I was able to confirm that it is not uvc_video_decode_data() in that case, however, given function_graph overhead and irqoff being partially broken I was unable to figure out much more. I attached some tracer outputs at the end. On Mon, May 18, 2015 at 8:59 AM, Tony Lindgren <tony@xxxxxxxxxxx> wrote: > For the memcpy part with DMA this patch should help too: > > [PATCH] dmaengine: omap-dma: Add support for memcpy The patch should help, but not out of the box as far as I understand. My understanding is that to take advantage of this, it would be necessary to explicitly call the DMA API either by replacing memcpy() in uvc_video_decode_data() or by modyfying the memcpy() implementation. Is that correct? ******************************************* /sys/kernel/debug/tracing# echo 0 > tracing_max_latency && echo 1 > tracing_on && usleep 10 && echo 0 > tracing_on # tracer: irqsoff # # irqsoff latency trace v1.1.5 on 3.17.8-custom # -------------------------------------------------------------------- # latency: 579 us, #26/26, CPU#0 | (M:server VP:0, KP:0, SP:0 HP:0 #P:1) # ----------------- # | task: sshd-2980 (uid:0 nice:0 policy:0 rt_prio:0) # ----------------- # => started at: __usb_hcd_giveback_urb # => ended at: __usb_hcd_giveback_urb # # # _------=> CPU# # / _-----=> irqs-off # | / _----=> need-resched # || / _---=> hardirq/softirq # ||| / _--=> preempt-depth # |||| / delay # cmd pid ||||| time | caller # \ / ||||| \ | / sshd-2980 0d.s. 0us : trace_hardirqs_on <-tasklet_action sshd-2980 0d.s. 0us : time_hardirqs_on <-tasklet_action sshd-2980 0d.s. 0us : trace_hardirqs_off <-_raw_spin_lock_irq sshd-2980 0d.s. 0us : trace_hardirqs_on <-_raw_spin_unlock_irq sshd-2980 0d.s. 0us : time_hardirqs_on <-_raw_spin_unlock_irq sshd-2980 0d.s. 0us+: trace_hardirqs_off <-__usb_hcd_giveback_urb sshd-2980 0d.s. 31us : uvc_video_complete <-__usb_hcd_giveback_urb sshd-2980 0d.s. 31us : uvc_video_decode_bulk <-uvc_video_complete sshd-2980 0d.s. 31us : uvc_video_decode_start <-uvc_video_decode_bulk sshd-2980 0d.s. 31us : ktime_get_ts64 <-uvc_video_decode_start sshd-2980 0d.s. 31us : clocksource_mmio_readl_up <-ktime_get_ts64 sshd-2980 0d.s. 31us!: uvc_video_decode_data.isra.17 <-uvc_video_decode_bulk sshd-2980 0d.s. 519us : usb_submit_urb <-uvc_video_complete sshd-2980 0d.s. 519us : usb_hcd_submit_urb <-usb_submit_urb sshd-2980 0d.s. 519us : usb_get_urb <-usb_hcd_submit_urb sshd-2980 0d.s. 519us : usb_hcd_map_urb_for_dma <-usb_hcd_submit_urb sshd-2980 0d.s. 519us+: ehci_urb_enqueue <-usb_hcd_submit_urb sshd-2980 0d.s. 549us : qh_urb_transaction <-ehci_urb_enqueue sshd-2980 0d.s. 549us : ehci_qtd_alloc.isra.71 <-qh_urb_transaction sshd-2980 0d.s. 549us : dma_pool_alloc <-ehci_qtd_alloc.isra.71 sshd-2980 0d.s. 549us : qtd_fill.isra.54 <-qh_urb_transaction sshd-2980 0d.s. 549us : usb_hcd_link_urb_to_ep <-ehci_urb_enqueue sshd-2980 0d.s. 549us+: qh_append_tds <-ehci_urb_enqueue sshd-2980 0d.s. 580us : trace_hardirqs_on <-__usb_hcd_giveback_urb sshd-2980 0d.s. 580us+: time_hardirqs_on <-__usb_hcd_giveback_urb sshd-2980 0d.s. 671us : <stack trace> => trace_hardirqs_on => __usb_hcd_giveback_urb => usb_giveback_urb_bh => tasklet_action => __do_softirq => irq_exit => handle_IRQ => omap3_intc_handle_irq => __irq_usr => ******************************************* ******************************************* # tracer: function_graph # # TIME DURATION FUNCTION CALLS # | | | | | | | 5853.350952 0.000 us | do_raw_spin_lock(); 5853.350952 0.000 us | do_raw_spin_unlock(); 5853.350952 | __usb_hcd_giveback_urb() { 5853.350952 | unmap_urb_for_dma() { 5853.350982 | usb_hcd_unmap_urb_for_dma() { 5853.350982 0.000 us | usb_hcd_unmap_urb_setup_for_dma(); 5853.350982 0.000 us | } /* usb_hcd_unmap_urb_for_dma */ 5853.350982 + 30.517 us | } /* unmap_urb_for_dma */ 5853.350982 0.000 us | usb_anchor_suspend_wakeups(); 5853.350982 0.000 us | usb_unanchor_urb(); 5853.351013 | uvc_video_complete [uvcvideo]() { 5853.351013 0.000 us | do_raw_spin_lock(); 5853.351013 0.000 us | do_raw_spin_unlock(); 5853.351043 | uvc_video_decode_bulk [uvcvideo]() { 5853.351043 ! 457.763 us | uvc_video_decode_data.isra.17 [uvcvideo](); 5853.351501 ! 457.763 us | } /* uvc_video_decode_bulk [uvcvideo] */ 5853.351501 | usb_submit_urb() { 5853.351501 | usb_hcd_submit_urb() { 5853.351501 0.000 us | usb_get_urb(); 5853.351501 0.000 us | usb_hcd_map_urb_for_dma(); 5853.351531 | ehci_urb_enqueue() { 5853.351531 | qh_urb_transaction() { 5853.351531 | ehci_qtd_alloc.isra.71() { 5853.351531 | dma_pool_alloc() { 5853.351531 0.000 us | do_raw_spin_lock(); 5853.351562 0.000 us | do_raw_spin_unlock(); 5853.351562 + 30.517 us | } /* dma_pool_alloc */ 5853.351562 + 30.517 us | } /* ehci_qtd_alloc.isra.71 */ 5853.351562 0.000 us | qtd_fill.isra.54(); 5853.351562 + 30.517 us | } /* qh_urb_transaction */ 5853.351593 0.000 us | do_raw_spin_lock(); 5853.351593 | usb_hcd_link_urb_to_ep() { 5853.351593 | _raw_spin_lock() { 5853.351593 0.000 us | do_raw_spin_lock(); 5853.351593 0.000 us | } /* _raw_spin_lock */ 5853.351623 | _raw_spin_unlock() { 5853.351623 0.000 us | do_raw_spin_unlock(); 5853.351623 0.000 us | } /* _raw_spin_unlock */ 5853.351623 + 30.518 us | } /* usb_hcd_link_urb_to_ep */ 5853.351623 0.000 us | qh_append_tds(); 5853.351654 0.000 us | do_raw_spin_unlock(); 5853.351654 ! 122.070 us | } /* ehci_urb_enqueue */ 5853.351654 ! 152.588 us | } /* usb_hcd_submit_urb */ 5853.351654 ! 152.588 us | } /* usb_submit_urb */ 5853.351654 ! 640.869 us | } /* uvc_video_complete [uvcvideo] */ 5853.351654 | handle_IRQ() { 5853.351715 0.000 us | idle_cpu(); 5853.351715 + 30.517 us | } /* handle_IRQ */ 5853.351715 0.000 us | usb_anchor_resume_wakeups(); 5853.351715 + 30.518 us | usb_free_urb(); 5853.351745 ! 793.457 us | } /* __usb_hcd_giveback_urb */ ******************************************* Tom -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html