On Fri, 2007-08-24 at 13:57 -0400, Steven Rostedt wrote:
> The latency tracer on SMP was giving crazy results. It was found that the
> get_monotonic_cycles function it uses was not returning a monotonic counter.
> The cause of this was that clock->cycle_raw and clock->cycle_last can
> be updated on another CPU in the middle of the calculation, leaving the
> cycle_now snapshot out of date. So the delta that was calculated from
> cycle_now - cycle_last was incorrect.
>
> This patch adds a loop to make sure that cycle_raw and cycle_last
> stay consistent throughout the calculation (otherwise it performs the
> loop again).
>
> With this patch the latency_tracer can produce normal results again.

Ah! Good catch. I totally missed that get_monotonic_cycles was being
called outside of the xtime lock.

> Signed-off-by: Steven Rostedt <rostedt@xxxxxxxxxxx>
>
> Index: linux-2.6-rt/kernel/time/timekeeping.c
> ===================================================================
> --- linux-2.6-rt.orig/kernel/time/timekeeping.c	2007-08-24 11:41:04.000000000 -0400
> +++ linux-2.6-rt/kernel/time/timekeeping.c	2007-08-24 11:47:01.000000000 -0400
> @@ -75,15 +75,30 @@ s64 __get_nsec_offset(void)
>
>  cycle_t notrace get_monotonic_cycles(void)
>  {
> -	cycle_t cycle_now, cycle_delta;
> +	cycle_t cycle_now, cycle_delta, cycle_raw, cycle_last;
>
> -	/* read clocksource: */
> -	cycle_now = clocksource_read(clock);
> +	do {
> +		/*
> +		 * cycle_raw and cycle_last can change on
> +		 * another CPU and we need the delta calculation
> +		 * of cycle_now and cycle_last to happen atomically,
> +		 * as well as the addition to cycle_raw. We don't
> +		 * need to grab any locks, we just keep trying until
> +		 * we get all the calculations together in one state.
> +		 */
> +		cycle_raw = clock->cycle_raw;
> +		cycle_last = clock->cycle_last;
> +
> +		/* read clocksource: */
> +		cycle_now = clocksource_read(clock);
> +
> +		/* calculate the delta since the last update_wall_time: */
> +		cycle_delta = (cycle_now - cycle_last) & clock->mask;
>
> -	/* calculate the delta since the last update_wall_time: */
> -	cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
> +	} while (cycle_raw != clock->cycle_raw ||
> +		 cycle_last != clock->cycle_last);

So if I'm understanding this right, not taking a lock isn't an
optimization (the seqlock read-side code would be almost the same), but
a requirement (since this might be called while xtime_lock is held),
correct? You might want to clarify that a bit in the comment.

> -	return clock->cycle_raw + cycle_delta;
> +	return cycle_raw + cycle_delta;
>  }
>
>  unsigned long notrace cycles_to_usecs(cycle_t cycles)

Otherwise:

Acked-by: John Stultz <johnstul@xxxxxxxxxx>

thanks
-john
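
For illustration, below is a minimal userspace sketch of the retry pattern the
patch adds. It is not the kernel code: the names (demo_clock, clk,
fake_counter, read_counter) are invented for the example, and a writer thread
stands in for update_wall_time(), which in the kernel advances cycle_raw and
cycle_last together under xtime_lock. The reader snapshots both fields, does
its arithmetic, and retries if either field changed underneath it, so it never
needs the lock itself.

/* Build with: cc -pthread demo.c -o demo   (sketch, not kernel code) */
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

typedef uint64_t cycle_t;

struct demo_clock {
	volatile cycle_t cycle_raw;	/* accumulated cycles at last update */
	volatile cycle_t cycle_last;	/* counter value at last update      */
	cycle_t mask;			/* width mask of the counter         */
};

static struct demo_clock clk = { .mask = ~0ULL };

/* Stand-in for the hardware counter: advances every time it is read. */
static cycle_t fake_counter;
static cycle_t read_counter(void)
{
	return __sync_fetch_and_add(&fake_counter, 1) & clk.mask;
}

/*
 * Reader: lockless retry loop, mirroring the patch.  If the writer
 * updated cycle_raw or cycle_last while we were calculating, redo the
 * whole calculation from a fresh snapshot.
 */
static cycle_t get_monotonic_cycles_demo(void)
{
	cycle_t now, delta, raw, last;

	do {
		raw   = clk.cycle_raw;
		last  = clk.cycle_last;
		now   = read_counter();
		delta = (now - last) & clk.mask;
	} while (raw != clk.cycle_raw || last != clk.cycle_last);

	return raw + delta;
}

/* Writer: plays the role of update_wall_time() advancing both fields. */
static void *writer(void *unused)
{
	(void)unused;
	for (int i = 0; i < 1000000; i++) {
		cycle_t now = read_counter();

		clk.cycle_raw += (now - clk.cycle_last) & clk.mask;
		clk.cycle_last = now;
	}
	return NULL;
}

int main(void)
{
	pthread_t w;
	cycle_t prev = 0;

	pthread_create(&w, NULL, writer, NULL);
	for (int i = 0; i < 1000000; i++) {
		cycle_t c = get_monotonic_cycles_demo();

		if (c < prev)
			printf("went backwards: %llu < %llu\n",
			       (unsigned long long)c, (unsigned long long)prev);
		prev = c;
	}
	pthread_join(w, NULL);
	return 0;
}

With a consistent (cycle_raw, cycle_last) pair the returned value reduces to
the same quantity regardless of which update epoch was sampled, so the reader
stays monotonic; without the recheck, a cycle_last updated mid-calculation can
make the value jump backwards, which is the behaviour the patch fixes. A real
seqcount would only be needed if the writer ever had to publish the two fields
independently.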