Re: [RFC] vmstat: Avoid waking up idle-cpu to service shepherd work

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Mar 30, 2015 at 05:32:16PM +0530, Viresh Kumar wrote:
> On 29 March 2015 at 15:54, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:

> > What I didn't say, but had thought of is that __run_timer() should skip
> > any timer that has RUNNING set -- for obvious reasons :-)

> Below is copied from your first reply, and so you probably already
> said that ? :)
> 
> > Also, once you have tbase_running, we can take base->running_timer out
> > altogether.

No, I means something else with that. We can remove the
tvec_base::running_timer field. Everything that uses that can use
tbase_running() AFAICT.

> I wanted to clarify if I understood it correctly..
> 
> Are you saying that:

> Case 2.) we keep retrying for it, until the time the other handler finishes?

That.

If we remove it from the list before we call ->fn. Therefore, even if
migrate happens, it will not see a RUNNING timer entry, seeing how its
not actually on any lists.

The only way to get on a list while running is if ->fn() requeues itself
_on_another_base_. When that happens, we need to wait for it to complete
running.

> Case 2.) We kept waiting for the first handler to finish ..
> - cpuY may waste some cycles as it kept waiting for handler to finish on cpuX ..

True, rather silly to requeue a timer on the same jiffy as its already
running through, but yes, an unlikely possibility.

You can run another timer while we wait -- if there is another of
course.

> - We may need to perform base unlock/lock on cpuY, so that cpuX can take cpuY's
> lock to reset tbase_running. And that might be racy, not sure.

Drop yes, racy not so much I think.


diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 2d3f5c504939..1394f9540348 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1189,12 +1189,39 @@ static inline void __run_timers(struct tvec_base *base)
 			cascade(base, &base->tv5, INDEX(3));
 		++base->timer_jiffies;
 		list_replace_init(base->tv1.vec + index, head);
+
+again:
 		while (!list_empty(head)) {
 			void (*fn)(unsigned long);
 			unsigned long data;
 			bool irqsafe;
 
-			timer = list_first_entry(head, struct timer_list,entry);
+			timer = list_first_entry(head, struct timer_list, entry);
+			if (unlikely(tbase_running(timer))) {
+				/* Only one timer on the list, force wait. */
+				if (unlikely(head->next == head->prev)) {
+					spin_unlock(&base->lock);
+
+					/*
+					 * The only way to get here is if the
+					 * handler requeued itself on another
+					 * base, this guarantees the timer will
+					 * not go away.
+					 */
+					while (tbase_running(timer))
+						cpu_relax();
+
+					spin_lock(&base->lock);
+				} else  {
+					/*
+					 * Otherwise, rotate the list and try
+					 * someone else.
+					 */
+					list_move_tail(&timer->entry, head);
+				}
+				goto again;
+			}
+
 			fn = timer->function;
 			data = timer->data;
 			irqsafe = tbase_get_irqsafe(timer->base);

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]