On Fri, 26 May 2017, Haris Okanovic wrote: > Oh crap. I think I see the problem. I decrement expired_count before > processing the list. Dropping the lock permits another run of > tick_find_expired()->find_expired_timers() in the middle of __expire_timers() > since it uses expired_count==0 as a condition. > > This should fix it, but I'll wait for Anna-Maria's test next week before > submitting a patch. > > > static void expire_timers(struct timer_base *base) > > { > > struct hlist_head *head; > > + int expCount = base->expired_count; No camel case for heavens sake! And this requires: cnt = READ_ONCE(base->expired_count); > > - while (base->expired_count--) { > > - head = base->expired_lists + base->expired_count; > > + while (expCount--) { > > + head = base->expired_lists + expCount; > > __expire_timers(base, head); > > } Plus a comment. > > base->expired_count = 0; Anna-Maria spotted the same issue, but I voted for the revert right now because I was worried about the consistency of base->clk under all circumstances. The other thing I noticed was this weird condition which does not do the look ahead when base->clk is back for some time. Why don't you use the existing optimization which uses the bitmap for fast forward? The other issue I have is that this can race at all. If you raised the softirq in the look ahead then you should not go into that function until the softirq has actually completed. There is no point in wasting time in the hrtimer interrupt if the softirq is running anyway. Thanks, tglx -- To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html