On Wed, Aug 01, 2018 at 11:19:57AM -0400, Johannes Weiner wrote:
> +static bool psi_update_stats(struct psi_group *group)
> +{
> +	u64 deltas[NR_PSI_STATES - 1] = { 0, };
> +	unsigned long missed_periods = 0;
> +	unsigned long nonidle_total = 0;
> +	u64 now, expires, period;
> +	int cpu;
> +	int s;
> +
> +	mutex_lock(&group->stat_lock);
> +
> +	/*
> +	 * Collect the per-cpu time buckets and average them into a
> +	 * single time sample that is normalized to wallclock time.
> +	 *
> +	 * For averaging, each CPU is weighted by its non-idle time in
> +	 * the sampling period. This eliminates artifacts from uneven
> +	 * loading, or even entirely idle CPUs.
> +	 *
> +	 * We don't need to synchronize against CPU hotplugging. If we
> +	 * see a CPU that's online and has samples, we incorporate it.
> +	 */
> +	for_each_online_cpu(cpu) {

I'm still puzzled by this.. for 99% of the machines online == possible.
Why not always iterate possible and leave it at that? This is hardly a
fast path.
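
To illustrate the point, here is a minimal userspace sketch of the
weighting the quoted comment describes. It is not the patch's code:
struct cpu_sample and weighted_stall() are hypothetical names, and it
assumes that a CPU which was offline for the whole period shows no new
non-idle time in its buckets. Under that assumption such a CPU gets
weight zero, so walking every possible CPU gives the same average as
walking only the online ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-CPU sample for one period (not the patch's layout). */
struct cpu_sample {
	uint64_t stall_ns;	/* time spent stalled in this period */
	uint64_t nonidle_ns;	/* time the CPU was non-idle in this period */
};

/*
 * Average the per-CPU stall times, weighting each CPU by its non-idle
 * time. Returns 0 if no CPU was non-idle during the period.
 */
static uint64_t weighted_stall(const struct cpu_sample *s, int n)
{
	uint64_t weighted_sum = 0;
	uint64_t nonidle_total = 0;
	int cpu;

	assert(s != NULL && n >= 0);

	for (cpu = 0; cpu < n; cpu++) {
		/* nonidle_ns == 0 (idle, or assumed offline) adds nothing. */
		weighted_sum += s[cpu].stall_ns * s[cpu].nonidle_ns;
		nonidle_total += s[cpu].nonidle_ns;
	}

	return nonidle_total ? weighted_sum / nonidle_total : 0;
}
```

With two busy CPUs at { 100, 1000 } and { 300, 1000 } followed by two
all-zero slots, iterating over the first two entries or over all four
yields the same result, 200.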