On Saturday 19 April 2008, david@xxxxxxx wrote:
> On Sat, 19 Apr 2008, Thomas Gleixner wrote:
>
> > On Fri, 18 Apr 2008, David Brownell wrote:
> >> On Friday 18 April 2008, Woodruff, Richard wrote:
> >>> When capturing some traces with dynamic tick we were noticing the
> >>> interrupt latency seems to go up a good amount.
> >>
> >>> I was wondering what thoughts of optimizing this might be.
> >>
> >> Cutting down the math implied by jiffies updates might help.
> >> And update_wall_time() costs, too.
> >> The 64 bit math for ktime structs isn't cheap; purely by eyeball,
> >> that was almost 1/3 the cost of that 24 usec (mostly __do_div64).
> >
> > Hmm, I have no real good idea to avoid the div64 in the case of a long
> > idle sleep. Any brilliant patches are welcome :)

That is, in tick_do_update_jiffies64()?

	delta = ktime_sub(delta, tick_period);
	last_jiffies_update = ktime_add(last_jiffies_update, tick_period);

	/* Slow path for long timeouts */
	if (unlikely(delta.tv64 >= tick_period.tv64)) {
		s64 incr = ktime_to_ns(tick_period);

		ticks = ktime_divns(delta, incr);

		last_jiffies_update = ktime_add_ns(last_jiffies_update,
						   incr * ticks);
	}
	do_timer(++ticks);

Some math not shown here is converting clocksource values to ktimes ...
cyc2ns() has a comment about needing some optimization, I wonder if
that's an issue here.

Maybe turning tick_period into an *actual* constant (it's a function
of HZ) would help a bit; "incr" too.

Re the "ticks = ktime_divns(...)": since "incr" is constant, the first
thing that comes to mind is a binary search over a precomputed table.
For HZ=100 (common for ARM) a table of size 128 would exceed the normal
range of NO_HZ tick rates ... down to below 1 HZ.

> how long is 'long idle sleep'? and how common are such sleeps?

The above code says "unlikely()" but that presumes very busy systems.
I would have assumed taking more than one tick was the most common
case, since most systems spend more time idle than working.
I certainly observe it to be the common case, and it's a power
management optimization goal.

> is it
> possibly worth the cost of a test in the hotpath to see if you need to do
> the 64 bit math or can get away with 32 bit math (at least on some
> platforms)

Possibly opening a can of worms, I'll observe that when the concern
is just to update jiffies, converting to ktime values seems all but
needless. Deltas at the level of a clocksource can be mapped to
jiffies as easily as deltas at the nsec level, saving some work...

Those delta tables could use just 32 bit values in the most common
cases: clocksource ticking at less than 4 GHz, and the IRQs firing
more often than once a second.

- Dave

_______________________________________________
linux-pm mailing list
linux-pm@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linux-foundation.org/mailman/listinfo/linux-pm
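
For illustration only, here is a rough userspace sketch of the delta-table
idea discussed above: map a 32-bit clocksource delta straight to a tick
count with a binary search over a precomputed table, skipping the ktime
conversion. It is not code from this thread or from the kernel. Every
name prefixed "sketch_" or "SKETCH_" is hypothetical, the 1 MHz
clocksource rate is an assumption, and HZ=100 with 128 table entries is
taken from the figures mentioned above. Whether this actually beats the
64-bit division on a given platform has not been measured here.

```c
#include <stdint.h>

/* Illustrative assumptions, not kernel definitions. */
#define SKETCH_HZ              100u      /* tick rate, as in the HZ=100 example */
#define SKETCH_CLOCK_HZ        1000000u  /* hypothetical 1 MHz clocksource */
#define SKETCH_TABLE_SIZE      128u      /* 128 ticks = 1.28 s at HZ=100 */
#define SKETCH_CYCLES_PER_TICK (SKETCH_CLOCK_HZ / SKETCH_HZ)

/* sketch_tick_cycles[i] holds (i + 1) ticks expressed in clocksource cycles. */
static uint32_t sketch_tick_cycles[SKETCH_TABLE_SIZE];

static void sketch_init_table(void)
{
	uint32_t i;

	for (i = 0; i < SKETCH_TABLE_SIZE; i++)
		sketch_tick_cycles[i] = (i + 1u) * SKETCH_CYCLES_PER_TICK;
}

/*
 * Number of whole ticks elapsed between two 32-bit counter reads.
 * Unsigned subtraction gives the right delta across one counter wrap.
 */
static uint32_t sketch_cycles_to_ticks(uint32_t last, uint32_t now)
{
	uint32_t delta = now - last;
	uint32_t lo = 0;
	uint32_t hi = SKETCH_TABLE_SIZE - 1u;

	/* Beyond the table: fall back to a (32-bit) division. */
	if (delta >= sketch_tick_cycles[hi])
		return delta / SKETCH_CYCLES_PER_TICK;

	/*
	 * Find the first entry greater than delta. Its index equals the
	 * number of entries <= delta, which is the whole-tick count.
	 */
	while (lo < hi) {
		uint32_t mid = (lo + hi) / 2u;

		if (sketch_tick_cycles[mid] <= delta)
			lo = mid + 1u;
		else
			hi = mid;
	}
	return lo;
}
```

After calling sketch_init_table() once, a caller would pass the previous
and current counter reads to sketch_cycles_to_ticks() and feed the result
to whatever advances jiffies. With 128 entries the search takes at most
seven 32-bit comparisons, and only sleeps longer than the table covers
reach the division.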