On 07/01/13 13:14, Thomas Gleixner wrote:
> The issue is very subtle. What happens is:
>
> CPU0                            CPU1
>
> Switch to oneshot mode
>
>  Copy the bits from tick_broadcast_mask to
>  tick_broadcast_oneshot_mask. We need to do
>  that so the other cpus reach the timer irq
>  and the softirq which switches them to
>  oneshot.
>
>  Kick the broadcast device into oneshot.
>
>                                 Timer interrupt fires
>
>                                 irq_enter sees the cpu in
>                                 tick_broadcast_oneshot_mask and
>                                 sets the device to oneshot mode
>
>                                 handle_periodic:
>                                   Sees oneshot mode and adds
>                                   period to
>                                   dev->next_event(KTIME_MAX)
>

Yep. It is also racing with the timer interrupt, so having more than two
CPUs must help widen the window (which is why we see it on the higher
numbered CPUs).

> So we need two fixes:
>
> 1) The replacement of the dummy timer and the effect on the broadcast
>    mask/device
>
> 2) tick_check_oneshot_broadcast needs a sanity check
>
> Patch below.

Looks good.

Reviewed-by: Stephen Boyd <sboyd@xxxxxxxxxxxxxx>

One minor typo in the comment below.

> +       switch (tick_broadcast_device.mode) {
> +       case TICKDEV_MODE_ONESHOT:
> +               /*
> +                * If the system is in oneshot mode we can
> +                * unconditionally clear the oneshot mask,
> +                * because the CPU is running and therefor not

s/therefor/therefore/

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation
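
For readers following the last step of the quoted race ("adds period to dev->next_event(KTIME_MAX)"), here is a rough user-space sketch of why that addition is harmful. It is not kernel code and not part of the patch. The names model_ktime_t, MODEL_KTIME_MAX, MODEL_TICK_PERIOD_NS and model_next_wrapping are hypothetical stand-ins, the 10 ms period is an assumed value, and the sketch assumes the add behaves as a plain wrapping 64-bit add.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's ktime_t and KTIME_MAX. */
typedef int64_t model_ktime_t;
#define MODEL_KTIME_MAX INT64_MAX

/* Assumed tick period for illustration only: 10 ms in nanoseconds. */
#define MODEL_TICK_PERIOD_NS INT64_C(10000000)

/*
 * Model of "add period to dev->next_event" as a wrapping 64-bit add.
 * The sum is computed in unsigned arithmetic because signed overflow
 * is undefined in ISO C. Converting the result back to int64_t is
 * implementation-defined, and wraps on the usual two's complement
 * compilers.
 */
static model_ktime_t model_next_wrapping(model_ktime_t next_event)
{
    uint64_t sum = (uint64_t)next_event + (uint64_t)MODEL_TICK_PERIOD_NS;

    return (model_ktime_t)sum;
}

/* Small self-check that exercises the normal and the KTIME_MAX case. */
void model_self_check(void)
{
    /* A sane base simply advances by one period. */
    assert(model_next_wrapping(1000) == 1000 + MODEL_TICK_PERIOD_NS);

    /* A base of KTIME_MAX wraps around to a negative expiry time. */
    assert(model_next_wrapping(MODEL_KTIME_MAX) < 0);
}
```

Under these assumptions, a device whose next_event is still KTIME_MAX ends up with a bogus expiry after the periodic handler adds one period, which is consistent with the need for a sanity check on the oneshot switch described in the quoted analysis.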