On Wed, 30 Jan 2019 at 10:39, Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote:
>
> On Wed, Jan 30, 2019 at 10:14 AM Vincent Guittot
> <vincent.guittot@xxxxxxxxxx> wrote:
> >
> > Hi Geert,
> >
> > On Wed, 30 Jan 2019 at 09:21, Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote:
> > >
> > > Hi Vincent,
> > >
> > > On Wed, Jan 30, 2019 at 9:16 AM Vincent Guittot
> > > <vincent.guittot@xxxxxxxxxx> wrote:
> > > > A deadlock has been seen when switching clocksources which use PM runtime.
> > > > The call path is:
> > > > change_clocksource
> > > >     ...
> > > >     write_seqcount_begin
> > > >     ...
> > > >     timekeeping_update
> > > >         ...
> > > >         sh_cmt_clocksource_enable
> > > >             ...
> > > >             rpm_resume
> > > >                 pm_runtime_mark_last_busy
> > > >                     ktime_get
> > > >                         do
> > > >                             read_seqcount_begin
> > > >                         while read_seqcount_retry
> > > >     ....
> > > >     write_seqcount_end
> > > >
> > > > Although we should be safe because we haven't yet changed the clocksource
> > > > at that time, we can't because of seqcount protection.
> > > >
> > > > Use ktime_get_mono_fast_ns instead which is lock safe for such case
> > > >
> > > > Fixes: 8234f6734c5d ("PM-runtime: Switch autosuspend over to using hrtimers")
> > > > Reported-by: Biju Das <biju.das@xxxxxxxxxxxxxx>
> > > > Signed-off-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
> > >
> > > Thanks for your patch!
> > >
> > > /**
> > >  * ktime_get_mono_fast_ns - Fast NMI safe access to clock monotonic
> > >  *
> > >  * This timestamp is not guaranteed to be monotonic across an update.
> > >  * The timestamp is calculated by:
> > >  *
> > >  *      now = base_mono + clock_delta * slope
> > >  *
> > >  * So if the update lowers the slope, readers who are forced to the
> > >  * not yet updated second array are still using the old steeper slope.
> > >  *
> > >  * tmono
> > >  * ^
> > >  * |    o  n
> > >  * |   o n
> > >  * |  u
> > >  * | o
> > >  * |o
> > >  * |12345678---> reader order
> > >  *
> > >  * o = old slope
> > >  * u = update
> > >  * n = new slope
> > >  *
> > >  * So reader 6 will observe time going backwards versus reader 5.
> > >  *
> > >  * While other CPUs are likely to be able observe that, the only way
> > >  * for a CPU local observation is when an NMI hits in the middle of
> > >  * the update. Timestamps taken from that NMI context might be ahead
> > >  * of the following timestamps. Callers need to be aware of that and
> > >  * deal with it.
> > >  */
> > >
> > > As this function is not guaranteed to be monotonic, have you checked how
> > > the Runtime PM code behaves if time goes backwards?  Does it just make
> > > a suboptimal decision or does it crash?
> >
> > As a worst case this will generate a suboptimal decision around the update
>
> So that should be explained in the changelog of the patch.  In detail,
> if poss, please.

Ok, I'm going to update the commit message.
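
To make the call path in the patch description concrete, here is a minimal
userspace sketch of why a seqcount reader cannot make progress when it runs
inside the writer's own critical section. It is only a single-threaded model:
the model_* names and the max_spins cap are assumptions made for illustration
and are not the kernel's seqcount API. The sketch assumes, as the call path
suggests, that the reader waits for an even sequence value before it starts,
while the writer keeps the value odd until write_seqcount_end.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Simplified, single-threaded model of a sequence counter.
 * All names here are hypothetical; this is not the kernel's seqcount_t.
 */
struct model_seqcount {
	unsigned int sequence;	/* odd while a write is in progress */
};

/* Models write_seqcount_begin(): the sequence becomes odd. */
static void model_write_begin(struct model_seqcount *s)
{
	s->sequence++;
}

/* Models write_seqcount_end(): the sequence becomes even again. */
static void model_write_end(struct model_seqcount *s)
{
	assert(s->sequence & 1);	/* must pair with model_write_begin() */
	s->sequence++;
}

/*
 * Models the reader's entry: wait until no write is in progress.
 * A real reader would spin without bound; the model gives up after
 * max_spins (an assumption for illustration) so that it terminates.
 * Returns true and stores the snapshot in *seq on success.
 */
static bool model_read_begin(const struct model_seqcount *s,
			     unsigned int max_spins, unsigned int *seq)
{
	for (unsigned int i = 0; i < max_spins; i++) {
		unsigned int cur = s->sequence;

		if (!(cur & 1)) {
			*seq = cur;
			return true;
		}
	}
	return false;	/* would have been an endless spin */
}

/*
 * Models the reported path: the writer opens its critical section and,
 * before closing it, ends up in a reader of the same counter
 * (change_clocksource -> ... -> ktime_get in the patch description).
 */
static bool model_read_inside_write(struct model_seqcount *s)
{
	unsigned int seq;
	bool ok;

	model_write_begin(s);
	ok = model_read_begin(s, 1000, &seq);	/* cannot succeed here */
	model_write_end(s);

	return ok;
}
```

Calling model_read_inside_write() on a zero-initialized counter returns false:
the only code that could make the sequence even again is the writer, and the
writer is the one waiting. A reader that never waits on the sequence, which is
what the patch selects with ktime_get_mono_fast_ns, avoids the problem at the
cost of the monotonicity caveat quoted above.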