Doug Anderson <dianders@xxxxxxxxxxxx> writes: > Hi, > > On Thu, Apr 20, 2017 at 11:54 AM, Eric Anholt <eric@xxxxxxxxxx> wrote: >> Stefan Wahren <stefan.wahren@xxxxxxxx> writes: >> >>> Hi, >>> >>>> Doug Anderson <dianders@xxxxxxxxxxxx> hat am 18. April 2017 um 22:41 geschrieben: >>>> >>>> >>>> It's hard to know for sure that all of this time is really in >>>> urb_enqueue(). Possible we could have task switched out and been >>>> blocked elsewhere. Using ftrace to get more fine-grained timings >>>> would be useful. ktime_get(), ktime_sub(), and ktime_to_us() are your >>>> friends here if you want to use trace_printk. >>> >>> i'm a newbie to ftrace, so i hope this would be helpful. >>> >>> # connect PL2303 to the onboard hub >>> # echo 0 > options/sleep-time >>> # echo 0 > function_profile_enabled >>> # echo 1 > function_profile_enabled >>> # ./usb_test >>> # Waiting for at least 20 seconds and then disconnect PL2303 >>> # echo 0 > function_profile_enabled >>> # cat trace_stat/function0 >>> >>> Function Hit Time Avg s^2 >>> -------- --- ---- --- --- >>> bcm2835_handle_irq 361347 219567633 us 607.636 us 1485199 us >>> __handle_domain_irq 1082482 212639551 us 196.437 us 3642030 us >>> generic_handle_irq 1082482 100592051 us 92.927 us 50511334 us >>> irq_exit 1082482 98197771 us 90.715 us 29649040 us >>> handle_level_irq 1082482 95812379 us 88.511 us 51910093 us >> >> If I'm reading this output right, we're spending half of our interrupt >> processing time in irq_exit(), so even if dwc2's interrupt was free (the >> generic_handle_irq() chain), we'd be eating about half the CPU getting >> back out of the interrupt handler, right? >> >> I don't really know anything about DWC2 or USB, but is there any way we >> could mitigate the interrupt frequency with this hardware? If nothing >> else, could we loop reading gintsts until it reads back 0? > > Take ftrace with a little bit of a grain of salt, especially on older > / slower ARMs (without the arch timer). Whenever ftrace takes a log > it grabs a timestamp. This can be an expensive (ish) operation. Even > on newer CPUs it's still not free if you call it as much as ftrace, > but on older CPUs it's extra expensive. If per-function timestamp cost was the problem, shouldn't I expect to see a bunch of irq_exit()'s children each taking a bit of time? We have a long callchain with the functions each taking a bit of time in the dwc2 interrupt handler, but irq_exit() seems to be a monolithic cost.
Attachment:
signature.asc
Description: PGP signature