[RFC] timers: Don't wake ktimersoftd on every tick

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



We recently upgraded from 4.1 to 4.6 and noticed a minor latency
regression caused by an additional thread wakeup (ktimersoftd) in
interrupt context on every tick. The wakeups are from
run_local_timers() raising TIMER_SOFTIRQ. Both TIMER and SCHED softirq
coalesced into one ksoftirqd wakeup prior to Sebastian's change to split
timers into their own thread (c002d3c0).

There's already logic in run_local_timers() to avoid some unnecessary
wakeups of ksoftirqd, but it doesn't seems to catch them all. In
particular, I've seen many unnecessary wakeups when jiffies increments
prior to run_local_timers().

This patch attempts to improve things by servicing timers in interrupt
context until one or more are ready to fire:

 * Increments base->clk to keep up with jiffies in interrupt context
 * Wakes ktimersoftd only when pending_map shows one or more ready
   timers

`cyclictest -p 98 -m` (4h run, no cpu affinity) shows ~5-10% decrease in
22-70us latency range, but a ~21% increase 20-21us and mixed results
 >70us. No change in max jitter. No change when `-S` is used.

[Traces]

ftp://ftp.ni.com/outgoing/tp01-timer-peek-traces.tgz
(Email me if link dies. Server periodically purges old files.)

Prior to this change, event traces of an idle system (before.dat) show
TIMER softirq_raise()s each tick without subsequent timer_expire_entries()s.
I.e. ktimesoftd woke to perform no work on both CPUs. after.dat only shows
TIMER softirq_raise()s only when followed by one or more expiring timers.

[Hardware/software/config]

 NI cRIO-9033
  2 core Intel Atom CPU

 Kernel 4.8.6-rt5
  CONFIG_HZ_PERIODIC=y

[Concerns/Issues/questions]

I'm relatively new to the timer subsystem, so please feel free to poke
as many holes as possible in this change. A few things that concern me
at the moment are:

Can jiffies change while one or more cpus is inside tick_sched_timer(),
 in interrupt context? I'm copying jiffies to a local variable in
 has_expired_timers() to ensure it doesn't run unbounded, but I'm not
 sure if that's necessary.

There may be a faster way to query pending_map. I copied the current
 algorithm from __collect_expired_timers(), which performs a linear
 search of each level for each clk, up to jiffies+HZ.

Special considerations for testing NO_HZ builds?

Other performance benchmark I should run?

Source: https://github.com/harisokanovic/linux/tree/dev/hokanovi/timer-peek

Thanks,
Haris
---
 kernel/time/timer.c | 47 +++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 45 insertions(+), 2 deletions(-)

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index ba53447..c6241ac 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1374,6 +1374,49 @@ static void expire_timers(struct timer_base *base, struct hlist_head *head)
 	}
 }
 
+/* Called in interrupt context to quickly check if one or
+ * more timers are expired in base
+ */
+static bool has_expired_timers(struct timer_base *base)
+{
+	bool res = false;
+	unsigned long clk, end_clk, idx;
+	int i;
+
+	raw_spin_lock(&base->lock);
+
+	end_clk = jiffies;
+	if (unlikely(time_after(end_clk, base->clk + HZ))) {
+		/* Don't spend too much time in interrupt context. If we didn't
+		 * service timers in the last second, defer to ktimersoftd.
+		 */
+		res = true;
+		goto end;
+	}
+
+	for ( ; base->clk <= end_clk; base->clk++) {
+		clk = base->clk;
+		for (i = 0; i < LVL_DEPTH; i++) {
+			idx = (clk & LVL_MASK) + i * LVL_SIZE;
+
+			if (test_bit(idx, base->pending_map)) {
+				res = true;
+				goto end;
+			}
+
+			/* Is it time to look at the next level? */
+			if (clk & LVL_CLK_MASK)
+				break;
+			/* Shift clock for the next level granularity */
+			clk >>= LVL_CLK_SHIFT;
+		}
+	}
+
+	end:
+	raw_spin_unlock(&base->lock);
+	return res;
+}
+
 static int __collect_expired_timers(struct timer_base *base,
 				    struct hlist_head *heads)
 {
@@ -1686,12 +1729,12 @@ void run_local_timers(void)
 
 	hrtimer_run_queues();
 	/* Raise the softirq only if required. */
-	if (time_before(jiffies, base->clk)) {
+	if (time_before(jiffies, base->clk) || !has_expired_timers(base)) {
 		if (!IS_ENABLED(CONFIG_NO_HZ_COMMON) || !base->nohz_active)
 			return;
 		/* CPU is awake, so check the deferrable base. */
 		base++;
-		if (time_before(jiffies, base->clk))
+		if (time_before(jiffies, base->clk) || !has_expired_timers(base))
 			return;
 	}
 	raise_softirq(TIMER_SOFTIRQ);
-- 
2.10.1

--
To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [RT Stable]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]

  Powered by Linux