Re: [tip:timers/urgent] timers: Unconditionally check deferrable base

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, 15 Jan 2018, Paul E. McKenney wrote:

> On Sun, Jan 14, 2018 at 02:44:15PM -0800, Paul E. McKenney wrote:
> > On Sun, Jan 14, 2018 at 02:31:07PM -0800, tip-bot for Thomas Gleixner wrote:
> > > Commit-ID:  ed4bbf7910b28ce3c691aef28d245585eaabda06
> > > Gitweb:     https://git.kernel.org/tip/ed4bbf7910b28ce3c691aef28d245585eaabda06
> > > Author:     Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> > > AuthorDate: Sun, 14 Jan 2018 23:19:49 +0100
> > > Committer:  Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> > > CommitDate: Sun, 14 Jan 2018 23:25:33 +0100
> > > 
> > > timers: Unconditionally check deferrable base
> > > 
> > > When the timer base is checked for expired timers then the deferrable base
> > > must be checked as well. This was missed when making the deferrable base
> > > independent of base::nohz_active.
> > 
> > Very cool, thank you!  I have fired up some of the dreaded TREE01 tests.
> 
> And (tentatively) a change in behavior!

That's the least thing I expected. None of this stuff should use deferrable
timers. But what do I know.

> In one case, I did not see stall warnings (at least not until the ftrace
> dump at test end), but just the occasional "Waylayed timer" message:
> 
> [ 1316.628105] schedule_timeout: Waylayed timer base->clk: 0x1000eff23 jiffies: 0x1000f8335 base->next_expiry: 0x1000eff23 timer->flags: 0x3c800006 timer->expires 0x1000f62ca idx: f2 idx_now: 139 base->pending_map 000000080000000000000000000000000000000000000000100400002000000000000000000000000000010000000000000000000000000000000000000000000000000000000000
> [ 1316.638372] Torture onoff task state:
> [ 1316.639774] rcu_preempt     I14984     8      2 0x80000000
> [ 1316.641766] Call Trace:
> [ 1316.642757]  ? __schedule+0x33c/0x6f0
> [ 1316.644109]  ? trace_event_raw_event_timer_start+0x6d/0xb0
> [ 1316.646042]  schedule+0x37/0x90
> [ 1316.647219]  schedule_timeout+0x159/0x470
> [ 1316.648825]  ? __next_timer_interrupt+0xc0/0xc0
> [ 1316.650459]  rcu_gp_kthread+0x9ee/0x1090
> [ 1316.652147]  kthread+0xf0/0x130
> [ 1316.653415]  ? rcu_oom_notify+0xd0/0xd0
> [ 1316.654790]  ? kthread_destroy_worker+0x40/0x40
> [ 1316.656470]  ret_from_fork+0x1f/0x30
> 
> If I am reading the waylay message correctly, the timer was delayed by
> about 8.3 seconds instead of the 21 seconds required to cause an RCU
> CPU stall warning.  I am going to rearrange the ftrace-dump code a bit
> and run more tests.

Would be interesting to stop the trace right when you detect the waylay.

> [  382.482303]  ret_from_fork+0x1f/0x30
> 
> I don't have enough runs to be sure, but my current (sad) guess is that
> the error rate is unchanged.

That patch really should not make any difference. We are wading through
traces and are still puzzled....

Thanks,

	tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-tip-commits" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Stable Commits]     [Linux Stable Kernel]     [Linux Kernel]     [Linux USB Devel]     [Linux Video &Media]     [Linux Audio Users]     [Yosemite News]     [Linux SCSI]

  Powered by Linux