On Mon, Dec 10, 2018 at 12:30:23PM +0100, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
>
> Add two new metrics for CPU idle states, "above" and "below", to count
> the number of times the given state had been asked for (or entered
> from the kernel's perspective), but the observed idle duration turned
> out to be too short or too long for it (respectively).
>
> These metrics help to estimate the quality of the CPU idle governor
> in use.
>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>

> @@ -260,6 +262,33 @@ int cpuidle_enter_state(struct cpuidle_d
>  		dev->last_residency = (int)diff;
>  		dev->states_usage[entered_state].time += dev->last_residency;
>  		dev->states_usage[entered_state].usage++;
> +
> +		if (diff < drv->states[entered_state].target_residency) {
> +			for (i = entered_state - 1; i >= 0; i--) {
> +				if (drv->states[i].disabled ||
> +				    dev->states_usage[i].disable)
> +					continue;
> +
> +				/* Shallower states are enabled, so update. */
> +				dev->states_usage[entered_state].above++;
> +				break;
> +			}
> +		} else if (diff > delay) {
> +			for (i = entered_state + 1; i < drv->state_count; i++) {
> +				if (drv->states[i].disabled ||
> +				    dev->states_usage[i].disable)
> +					continue;
> +
> +				/*
> +				 * Update if a deeper state would have been a
> +				 * better match for the observed idle duration.
> +				 */
> +				if (diff - delay >= drv->states[i].target_residency)
> +					dev->states_usage[entered_state].below++;
> +
> +				break;
> +			}
> +		}

One question on this: why is this tracked unconditionally? Would not a
tracepoint be better? Then there is no overhead in the normal case where
nobody gives a crap about these here numbers.
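
For reference, the accounting in the quoted hunk can be read as the
standalone sketch below. It is only an illustration, not the kernel code:
struct sketch_state and sketch_account() are made-up names, the driver and
per-device "disabled" flags are folded into one field, and delay is passed
in as a parameter because the quoted hunk does not show how it is computed.

```c
/*
 * Hypothetical, simplified stand-in for the per-state data that the
 * patch reads and updates. Not a kernel structure.
 */
struct sketch_state {
	unsigned int target_residency;	/* minimum worthwhile idle time */
	int disabled;			/* non-zero: state cannot be used */
	unsigned long long above;	/* idle period was too short */
	unsigned long long below;	/* idle period was too long */
};

/*
 * Account one idle period of length diff spent in state 'entered'.
 * 'delay' plays the role of the variable of the same name in the patch;
 * its origin is an assumption left to the caller.
 */
static void sketch_account(struct sketch_state *states, int count,
			   int entered, long long diff, long long delay)
{
	int i;

	if (diff < states[entered].target_residency) {
		/* Too short: count it only if a shallower state is enabled. */
		for (i = entered - 1; i >= 0; i--) {
			if (states[i].disabled)
				continue;

			states[entered].above++;
			break;
		}
	} else if (diff > delay) {
		/* Possibly too long: check the nearest enabled deeper state. */
		for (i = entered + 1; i < count; i++) {
			if (states[i].disabled)
				continue;

			if (diff - delay >= states[i].target_residency)
				states[entered].below++;

			break;
		}
	}
}
```

With three enabled states whose target residencies are 1, 100 and 1000,
entering state 1 for 50 units bumps its "above" counter, entering it for
2000 units (delay 10) bumps "below", and 500 units changes neither.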