On Wed, 30 Jan 2019 at 12:16, Vincent Guittot <vincent.guittot@xxxxxxxxxx> wrote: > > A deadlock has been seen when swicthing clocksources which use PM runtime. > The call path is: > change_clocksource > ... > write_seqcount_begin > ... > timekeeping_update > ... > sh_cmt_clocksource_enable > ... > rpm_resume Hmm, isn't this already a problem in genpd then? In genpd's ->runtime_resume() callback (genpd_runtime_resume()) we call ktime_get() to measure the latency of running the callback. Or, perhaps irq_safe_dev_in_no_sleep_domain() returns true for the genpd + device in question, which means the ktime thingy is skipped. Geert? > pm_runtime_mark_last_busy > ktime_get > do > read_seqcount_begin > while read_seqcount_retry > .... > write_seqcount_end > > Although we should be safe because we haven't yet changed the clocksource > at that time, we can't because of seqcount protection. > > Use ktime_get_mono_fast_ns() instead which is lock safe for such case > > With ktime_get_mono_fast_ns, the timestamp is not guaranteed to be > monotonic across an update and as a result can goes backward. According to > update_fast_timekeeper() description: "In the worst case, this can > result is a slightly wrong timestamp (a few nanoseconds)". For > PM runtime autosuspend, this means only that the suspend decision can > be slightly sub optimal. > > Fixes: 8234f6734c5d ("PM-runtime: Switch autosuspend over to using hrtimers") > Reported-by: Biju Das <biju.das@xxxxxxxxxxxxxx> > Signed-off-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx> Reviewed-by: Ulf Hansson <ulf.hansson@xxxxxxxxxx> Kind regards Uffe > --- > > - v2: Updated commit message to explain the impact of using > ktime_get_mono_fast_ns() > > drivers/base/power/runtime.c | 10 +++++----- > include/linux/pm_runtime.h | 2 +- > 2 files changed, 6 insertions(+), 6 deletions(-) > > diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c > index 457be03..708a13f 100644 > --- a/drivers/base/power/runtime.c > +++ b/drivers/base/power/runtime.c > @@ -130,7 +130,7 @@ u64 pm_runtime_autosuspend_expiration(struct device *dev) > { > int autosuspend_delay; > u64 last_busy, expires = 0; > - u64 now = ktime_to_ns(ktime_get()); > + u64 now = ktime_get_mono_fast_ns(); > > if (!dev->power.use_autosuspend) > goto out; > @@ -909,7 +909,7 @@ static enum hrtimer_restart pm_suspend_timer_fn(struct hrtimer *timer) > * If 'expires' is after the current time, we've been called > * too early. > */ > - if (expires > 0 && expires < ktime_to_ns(ktime_get())) { > + if (expires > 0 && expires < ktime_get_mono_fast_ns()) { > dev->power.timer_expires = 0; > rpm_suspend(dev, dev->power.timer_autosuspends ? > (RPM_ASYNC | RPM_AUTO) : RPM_ASYNC); > @@ -928,7 +928,7 @@ static enum hrtimer_restart pm_suspend_timer_fn(struct hrtimer *timer) > int pm_schedule_suspend(struct device *dev, unsigned int delay) > { > unsigned long flags; > - ktime_t expires; > + u64 expires; > int retval; > > spin_lock_irqsave(&dev->power.lock, flags); > @@ -945,8 +945,8 @@ int pm_schedule_suspend(struct device *dev, unsigned int delay) > /* Other scheduled or pending requests need to be canceled. */ > pm_runtime_cancel_pending(dev); > > - expires = ktime_add(ktime_get(), ms_to_ktime(delay)); > - dev->power.timer_expires = ktime_to_ns(expires); > + expires = ktime_get_mono_fast_ns() + (u64)delay * NSEC_PER_MSEC); > + dev->power.timer_expires = expires; > dev->power.timer_autosuspends = 0; > hrtimer_start(&dev->power.suspend_timer, expires, HRTIMER_MODE_ABS); > > diff --git a/include/linux/pm_runtime.h b/include/linux/pm_runtime.h > index 54af4ee..fed5be7 100644 > --- a/include/linux/pm_runtime.h > +++ b/include/linux/pm_runtime.h > @@ -105,7 +105,7 @@ static inline bool pm_runtime_callbacks_present(struct device *dev) > > static inline void pm_runtime_mark_last_busy(struct device *dev) > { > - WRITE_ONCE(dev->power.last_busy, ktime_to_ns(ktime_get())); > + WRITE_ONCE(dev->power.last_busy, ktime_get_mono_fast_ns()); > } > > static inline bool pm_runtime_is_irq_safe(struct device *dev) > -- > 2.7.4 >