On Wed, Dec 13, 2017 at 11:54:56AM +0100, Greg KH wrote: > On Wed, Dec 13, 2017 at 11:11:16AM +0100, Rabin Vincent wrote: > > From: Rabin Vincent <rabinv@xxxxxxxx> > > > > softirq time accounting is broken on v4.9.x if ksoftirqd runs. > > > > With > > CONFIG_IRQ_TIME_ACCOUNTING=y > > # CONFIG_VIRT_CPU_ACCOUNTING_GEN is not set > > > > this test code: > > > > struct tasklet_struct tasklet; > > > > static void delay_tasklet(unsigned long data) > > { > > udelay(10); > > tasklet_schedule(&tasklet); > > } > > > > tasklet_init(&tasklet, delay_tasklet, 0); > > tasklet_schedule(&tasklet); > > > > results in: > > > > $ while :; do grep cpu0 /proc/stat; done > > cpu0 5 0 80 25 16 107 1 0 0 0 > > cpu0 5 0 80 25 16 107 0 0 0 0 > > cpu0 5 0 80 25 16 107 0 0 0 0 > > cpu0 5 0 80 25 16 107 0 0 0 0 > > cpu0 5 0 81 25 16 107 0 0 0 0 > > cpu0 5 0 81 25 16 107 1 0 0 0 > > cpu0 5 0 81 25 16 108 18446744073709551615 0 0 0 > > cpu0 5 0 81 25 16 108 18446744073709551615 0 0 0 > > cpu0 5 0 81 25 16 108 18446744073709551615 0 0 0 > > cpu0 5 0 81 25 16 108 0 0 0 0 > > cpu0 6 0 81 25 16 108 0 0 0 0 > > cpu0 6 0 81 25 16 108 0 0 0 0 > > > > As can be seen, the softirq numbers are totally bogus. > > > > When ksoftirq is running, irqtime_account_process_tick() increments > > cpustat[CPUSTAT_SOFTIRQ]. This causes the "nsecs_to_cputime64(irqtime) > > - cpustat[CPUSTAT_SOFTIRQ]" calculation in irqtime_account_update() to > > underflow the next time a softirq is handled leading to the above > > values. > > > > The underflow bug was added by 57430218317e5b280 ("sched/cputime: Count > > actually elapsed irq & softirq time"). > > > > But ksoftirqd accounting was wrong even in earlier kernels. In earlier > > kernels, after a ksoftirq run, the kernel would simply stop accounting > > softirq time spent outside of ksoftirqd until that (accumulated) time > > exceeded the time for which ksofirqd previously had run. > > > > Fix both the underflow and the wrong accounting by using a counter > > specifically for the non-ksoftirqd softirq case. > > > > This code has been fixed in current mainline by a499a5a14db > > ("sched/cputime: Increment kcpustat directly on irqtime account") [note > > also the followup 25e2d8c1b9e327e ("sched/cputime: Fix ksoftirqd cputime > > accounting regression")], but that patch is a part of the many changes > > for eliminating of cputime_t so it does not seem suitable for backport. > > I _really_ only want to take the exact upstream patches, as every time > we do something like what you are proposing to do here, we get it wrong. > Seriously, our track record is horrible. Like 90% wrong. > > Can you try just those two patches instead? Dropping from my mbox due to a lack of response. If you do test this out, please resend. thanks, greg k-h