On Wednesday, February 27, 2019 8:58:35 PM CET Ulf Hansson wrote:
> To be able to predict the sleep duration for a CPU that is entering idle,
> knowing when the next timer/tick is going to expire, is extremely useful.
> Both the teo and the menu cpuidle governors already makes use of this
> information, while selecting an idle state.
>
> Moving forward, the similar prediction needs to be done, but for a group of
> idle CPUs rather than for a single idle CPU. Following changes implements a
> new genpd governor, which needs this.
>
> Support this, by sharing a new function called
> tick_nohz_get_next_hrtimer(), which returns the next hrtimer or the next
> tick, whatever that expires first.
>
> Additionally, when cpuidle is about to invoke the ->enter() callback, then
> call tick_nohz_get_next_hrtimer() and store its return value in the per CPU
> struct cpuidle_device, as to make it available outside cpuidle.
>
> Do note, at the point when cpuidle calls tick_nohz_get_next_hrtimer(), the
> governor's ->select() callback has already made a decision whether to stop
> the tick or not. In this way, tick_nohz_get_next_hrtimer() actually returns
> the next timer expiration, whatever origin.
>
> Cc: Lina Iyer <ilina@xxxxxxxxxxxxxx>
> Co-developed-by: Lina Iyer <lina.iyer@xxxxxxxxxx>
> Co-developed-by: Daniel Lezcano <daniel.lezcano@xxxxxxxxxx>
> Signed-off-by: Ulf Hansson <ulf.hansson@xxxxxxxxxx>
> ---
>
> Changes in v12:
> 	- New patch.
>
> ---
>  drivers/cpuidle/cpuidle.c |  8 ++++++++
>  include/linux/cpuidle.h   |  1 +
>  include/linux/tick.h      |  7 ++++++-
>  kernel/time/tick-sched.c  | 12 ++++++++++++
>  4 files changed, 27 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
> index 7f108309e871..255365b1a6ab 100644
> --- a/drivers/cpuidle/cpuidle.c
> +++ b/drivers/cpuidle/cpuidle.c
> @@ -328,6 +328,14 @@ int cpuidle_select(struct cpuidle_driver *drv, struct cpuidle_device *dev,
>  int cpuidle_enter(struct cpuidle_driver *drv, struct cpuidle_device *dev,
>  		  int index)
>  {
> +	/*
> +	 * Store the next hrtimer, which becomes either next tick or the next
> +	 * timer event, whatever expires first. Additionally, to make this data
> +	 * useful for consumers outside cpuidle, we rely on that the governor's
> +	 * ->select() callback have decided, whether to stop the tick or not.
> +	 */
> +	dev->next_hrtimer = tick_nohz_get_next_hrtimer();

I would use WRITE_ONCE() to set next_hrtimer here and READ_ONCE() for reading
that value in the next patch, as a matter of annotation if nothing else.

> +
>  	if (cpuidle_state_is_coupled(drv, index))
>  		return cpuidle_enter_state_coupled(dev, drv, index);
>  	return cpuidle_enter_state(dev, drv, index);

Also I would clear next_hrtimer here to avoid dragging stale values around.

Apart from this the series LGTM.

Thanks!
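To illustrate the two suggestions above (annotated accesses, plus clearing
the field once the idle state has been left), here is a rough userspace
sketch. It is not the kernel code: the WRITE_ONCE()/READ_ONCE() macros are
simplified stand-ins built on volatile casts (using the GCC/Clang
__typeof__ extension), and struct cpuidle_device_mock and every mock_*
helper are hypothetical names made up for the example. Clearing after the
call means the return value has to be held in a local first, which is my
assumption about how the function would need to be restructured.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the kernel's ktime_t (assumption: plain 64-bit nanoseconds). */
typedef int64_t ktime_t;

/* Simplified userspace stand-ins for the kernel's annotation macros. */
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))
#define READ_ONCE(x)       (*(volatile __typeof__(x) *)&(x))

/* Hypothetical cut-down device struct holding only the new field. */
struct cpuidle_device_mock {
	ktime_t next_hrtimer;
};

/* Hypothetical replacement for tick_nohz_get_next_hrtimer(). */
static ktime_t mock_next_event = 1000;

static ktime_t mock_tick_nohz_get_next_hrtimer(void)
{
	return mock_next_event;
}

/* Records what a consumer would observe while the CPU is "idle". */
static ktime_t seen_in_idle;

/* Hypothetical replacement for cpuidle_enter_state(). */
static int mock_enter_state(struct cpuidle_device_mock *dev, int index)
{
	/* A consumer outside cpuidle would read the field like this. */
	seen_in_idle = READ_ONCE(dev->next_hrtimer);
	return index;
}

static int mock_cpuidle_enter(struct cpuidle_device_mock *dev, int index)
{
	int ret;

	assert(dev != NULL);

	/* Publish the next expiration before entering the idle state. */
	WRITE_ONCE(dev->next_hrtimer, mock_tick_nohz_get_next_hrtimer());

	ret = mock_enter_state(dev, index);

	/* Clear the field on the way out so no stale value lingers. */
	WRITE_ONCE(dev->next_hrtimer, 0);

	return ret;
}
```

With this shape, a reader sees the published expiration only while the CPU
is inside the idle state and sees 0 otherwise. Whether 0 is a suitable
"no value" marker for the consumer in the next patch is something that
patch would need to decide.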