On Mon, 1 Jul 2013, Stephen Boyd wrote:
> On 07/01/13 13:14, Thomas Gleixner wrote:
> > The issue is very subtle. What happens is:
> >
> > CPU0                    CPU1
> >
> > Switch to oneshot mode
> >
> >    Copy the bits from tick_broadcast_mask to
> >    tick_broadcast_oneshot_mask. We need to do
> >    that so the other cpus reach the timer irq
> >    and the softirq which switches them to
> >    oneshot.
> >
> >    Kick the broadcast device into oneshot.
> >
> >                         Timer interrupt fires
> >
> >                         irq_enter sees the cpu in
> >                         tick_broadcast_oneshot_mask and
> >                         sets the device to oneshot mode
> >
> >                         handle_periodic:
> >                             Sees oneshot mode and adds
> >                             period to
> >                             dev->next_event(KTIME_MAX)
> >
>
> Yep. It is also racing with the timer interrupt so having more than two
> CPUs must help widen the window (which is why we see it on the higher
> numbered CPUs).

The race above is about the timer interrupt. You mean the broadcast
one which is still enabled due to the dummy -> functional transition
issue, right? That helps a lot to make this more visible, because we
double the number of events.

> > +    * because the CPU is running and therefor not
>
> s/therefor/therefore/

Duh. That one haunts me forever.

/me goes off to split the patch into two separate fixes, add proper
changelogs and wait for Vincent's confirmation.

I really wish that x86 had been the only architecture which made use
of that broadcast nonsense. Though the ARM folks went there and
created the same mess as x86 but raised to the power of N, where N =
Number of odd ARM chips designed by morons who thought that copying
the already publicly documented idiocy of x86 is a brilliant idea.

Thanks,

	tglx
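
For reference, the last step of the race quoted above (handle_periodic
adding the period to a next_event of KTIME_MAX) can be sketched in plain
userspace C. This is only a toy model under stated assumptions, not the
kernel code: struct clock_dev, switch_to_oneshot() and
next_periodic_event() are hypothetical names, and the model assumes that
next_event holds KTIME_MAX right after the switch to oneshot, as the
diagram says.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
typedef int64_t ktime_ns;          /* signed 64-bit nanoseconds */
#define KTIME_MAX_NS INT64_MAX     /* "no event programmed" sentinel */

enum dev_mode { MODE_PERIODIC, MODE_ONESHOT };

struct clock_dev {
	enum dev_mode mode;
	ktime_ns next_event;
};

/*
 * Models CPU1's irq_enter step in the diagram: the device is flipped to
 * oneshot mode. Assumption: no event is programmed yet, so next_event is
 * the KTIME_MAX sentinel.
 */
static void switch_to_oneshot(struct clock_dev *dev)
{
	dev->mode = MODE_ONESHOT;
	dev->next_event = KTIME_MAX_NS;
}

/*
 * Models the "adds period to dev->next_event" step. Returns true and
 * stores the sum in *next when it is representable; returns false and
 * leaves *next untouched when the addition would overflow.
 */
static bool next_periodic_event(const struct clock_dev *dev,
				ktime_ns period, ktime_ns *next)
{
	assert(period > 0);

	/* Check before adding: signed overflow is undefined in C. */
	if (dev->next_event > INT64_MAX - period)
		return false;

	*next = dev->next_event + period;
	return true;
}
```

In this model, once switch_to_oneshot() has run, next_periodic_event()
returns false for any positive period, which is the situation the race
creates when the periodic handler runs against a device that is already
in oneshot mode.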