On Fri, Sep 07, 2018 at 12:16:34PM +0200, Peter Zijlstra wrote:
> This does the whole seqcount thing 6x, which is a bit of a waste.
>
> struct snapshot {
> 	u32 times[NR_PSI_STATES];
> };
>
> static inline struct snapshot get_times_snapshot(struct psi_group *pg, int cpu)
> {
> 	struct psi_group_cpu *pgc = per_cpu_ptr(pg->pcpu, cpu);
> 	struct snapshot s;
> 	unsigned int seq;
> 	u32 delta;
> 	int i;
>
> 	do {
> 		seq = read_seqcount_begin(&pgc->seq);
>
> 		delta = cpu_clock(cpu) - pgc->state_start;
> 		for (i = 0; i < NR_PSI_STATES; i++) {
> 			s.times[i] = pgc->times[i];
> 			if (test_state(pgc->tasks, i))
> 				s.times[i] += delta;
> 		}
>
> 	} while (read_seqcount_retry(&pgc->seq, seq));

Sorry, I forgot the whole times_prev thing:

	for (i = 0; i < NR_PSI_STATES; i++) {
		tmp = s.times[i];
		s.times[i] -= pgc->times_prev[i];
		pgc->times_prev[i] = tmp;
	}

> 	return s;
> }
>
>
> 	for_each_possible_cpu(cpu) {
> 		struct snapshot s = get_times_snapshot(pg, cpu);
>
> 		nonidle = nsecs_to_jiffies(s.times[PSI_NONIDLE]);
> 		nonidle_total += nonidle;
>
> 		for (i = 0; i < PSI_NONIDLE; i++)
> 			deltas[i] += (u64)s.times[i] * nonidle;
>
> 		/* ... */
>
> 	}
>
>
> It's a bit cumbersome, but that's because of C.
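
For anyone who wants to poke at the idea outside the kernel, here is a rough userspace sketch of the same pattern: take all per-state times in one seqcount read section, then turn them into deltas against times_prev. Everything named here is a stand-in, not the kernel API: struct group_cpu, NR_STATES, the active bitmask (in place of test_state()), the now argument (in place of cpu_clock()), and the two seq_read_*() helpers. The data fields are plain integers, so this only shows the control flow; a genuinely concurrent version would also need atomic accesses for the data.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NR_STATES 3	/* hypothetical; stands in for NR_PSI_STATES */

static_assert(NR_STATES <= 32, "active bitmask is 32 bits wide");

/* Hypothetical stand-in for the per-CPU group state. */
struct group_cpu {
	atomic_uint seq;			/* even: stable, odd: writer active */
	uint32_t times[NR_STATES];		/* accumulated time per state */
	uint32_t times_prev[NR_STATES];		/* totals seen by the last snapshot */
	uint32_t state_start;			/* when the current states began */
	uint32_t active;			/* bit i set: state i is running */
};

struct snapshot {
	uint32_t times[NR_STATES];
};

/* Wait for an even sequence number, i.e. no writer in progress. */
static unsigned int seq_read_begin(atomic_uint *seq)
{
	unsigned int v;

	while ((v = atomic_load(seq)) & 1u)
		;	/* spin until the writer is done */
	return v;
}

/* Nonzero if a writer ran since seq_read_begin() returned start. */
static int seq_read_retry(atomic_uint *seq, unsigned int start)
{
	return atomic_load(seq) != start;
}

/* One read section for all states, then convert totals to deltas. */
static struct snapshot get_times_snapshot(struct group_cpu *gc, uint32_t now)
{
	struct snapshot s;
	unsigned int seq;
	uint32_t delta, tmp;
	int i;

	do {
		seq = seq_read_begin(&gc->seq);

		delta = now - gc->state_start;
		for (i = 0; i < NR_STATES; i++) {
			s.times[i] = gc->times[i];
			if (gc->active & (UINT32_C(1) << i))
				s.times[i] += delta;	/* include in-flight time */
		}
	} while (seq_read_retry(&gc->seq, seq));

	/* Report only what accrued since the previous snapshot. */
	for (i = 0; i < NR_STATES; i++) {
		tmp = s.times[i];
		s.times[i] -= gc->times_prev[i];
		gc->times_prev[i] = tmp;
	}

	return s;
}
```

The first call returns the full totals (times_prev starts at zero); each later call returns only the time added since the call before it, which is what the aggregation loop wants to multiply by nonidle.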