On Mon, Mar 30, 2015 at 05:32:16PM +0530, Viresh Kumar wrote: > On 29 March 2015 at 15:54, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote: > > What I didn't say, but had thought of is that __run_timer() should skip > > any timer that has RUNNING set -- for obvious reasons :-) > Below is copied from your first reply, and so you probably already > said that ? :) > > > Also, once you have tbase_running, we can take base->running_timer out > > altogether. No, I means something else with that. We can remove the tvec_base::running_timer field. Everything that uses that can use tbase_running() AFAICT. > I wanted to clarify if I understood it correctly.. > > Are you saying that: > Case 2.) we keep retrying for it, until the time the other handler finishes? That. If we remove it from the list before we call ->fn. Therefore, even if migrate happens, it will not see a RUNNING timer entry, seeing how its not actually on any lists. The only way to get on a list while running is if ->fn() requeues itself _on_another_base_. When that happens, we need to wait for it to complete running. > Case 2.) We kept waiting for the first handler to finish .. > - cpuY may waste some cycles as it kept waiting for handler to finish on cpuX .. True, rather silly to requeue a timer on the same jiffy as its already running through, but yes, an unlikely possibility. You can run another timer while we wait -- if there is another of course. > - We may need to perform base unlock/lock on cpuY, so that cpuX can take cpuY's > lock to reset tbase_running. And that might be racy, not sure. Drop yes, racy not so much I think. diff --git a/kernel/time/timer.c b/kernel/time/timer.c index 2d3f5c504939..1394f9540348 100644 --- a/kernel/time/timer.c +++ b/kernel/time/timer.c @@ -1189,12 +1189,39 @@ static inline void __run_timers(struct tvec_base *base) cascade(base, &base->tv5, INDEX(3)); ++base->timer_jiffies; list_replace_init(base->tv1.vec + index, head); + +again: while (!list_empty(head)) { void (*fn)(unsigned long); unsigned long data; bool irqsafe; - timer = list_first_entry(head, struct timer_list,entry); + timer = list_first_entry(head, struct timer_list, entry); + if (unlikely(tbase_running(timer))) { + /* Only one timer on the list, force wait. */ + if (unlikely(head->next == head->prev)) { + spin_unlock(&base->lock); + + /* + * The only way to get here is if the + * handler requeued itself on another + * base, this guarantees the timer will + * not go away. + */ + while (tbase_running(timer)) + cpu_relax(); + + spin_lock(&base->lock); + } else { + /* + * Otherwise, rotate the list and try + * someone else. + */ + list_move_tail(&timer->entry, head); + } + goto again; + } + fn = timer->function; data = timer->data; irqsafe = tbase_get_irqsafe(timer->base); -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>