On Mon, Jun 25, 2018 at 4:56 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Thu, Jun 21, 2018 at 05:25:17PM -0400, Pavel Tatashin wrote:
> > Allow sched_clock() to be used before sched_clock_init() and
> > sched_clock_init_late() are called. This provides us with a way to get
> > early boot timestamps on machines with unstable clocks.
>
> There are !x86 architectures that use this code and might not expect to
> have their sched_clock() called quite that early. Please verify.

Correct. From what I could gather so far, with this change the other
arches with unstable clocks either pick up and start outputting time
earlier, or output 0. I could not test every specific configuration, of
course.

> > Signed-off-by: Pavel Tatashin <pasha.tatashin@xxxxxxxxxx>
> > ---
> >  kernel/sched/clock.c | 10 ++++++++--
> >  1 file changed, 8 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/sched/clock.c b/kernel/sched/clock.c
> > index 10c83e73837a..f034392b0f6c 100644
> > --- a/kernel/sched/clock.c
> > +++ b/kernel/sched/clock.c
> > @@ -205,6 +205,11 @@ void clear_sched_clock_stable(void)
> >   */
> >  static int __init sched_clock_init_late(void)
> >  {
> > +	/* Transition to unstable clock from early clock */
>
> This is wrong... or at least it smells horribly.
>
> This is not the point where we transition from early to unstable, that
> is in fact in sched_clock_init.
>
> This function, sched_clock_init_late(), is where we attempt to
> transition from unstable to stable. And this is _waaaay_ after SMP init.
>
> > +	local_irq_disable();
> > +	__gtod_offset = sched_clock() + __sched_clock_offset - ktime_get_ns();
> > +	local_irq_enable();
>
> This might work in sched_clock_init(), which is pre-SMP.
>
> >  	sched_clock_running = 2;
> >  	/*
> >  	 * Ensure that it is impossible to not do a static_key update.
> > @@ -350,8 +355,9 @@ u64 sched_clock_cpu(int cpu)
> >  	if (sched_clock_stable())
> >  		return sched_clock() + __sched_clock_offset;
> >
> > -	if (unlikely(!sched_clock_running))
> > -		return 0ull;
> > +	/* Use early clock until sched_clock_init_late() */
> > +	if (unlikely(sched_clock_running < 2))
> > +		return sched_clock() + __sched_clock_offset;
>
> And then this remains !sched_clock_running, except instead of 0, you
> then return sched_clock() + __sched_clock_offset;
>
> >  	preempt_disable_notrace();
> >  	scd = cpu_sdc(cpu);

> All the above is because of:
>
> > +	if (unlikely(sched_clock_running < 2))
> > +		return sched_clock() + __sched_clock_offset;

This keeps the early boot clock until sched_clock_init_late(). It can
be changed to:

+	if (unlikely(!sched_clock_running))
+		return sched_clock() + __sched_clock_offset;

with all the necessary changes done in sched_clock_init() instead.

The reason I had "< 2" in the first place is that we did not have early
static branching before, so sched_clock() ended up outputting: early
TSC, then jiffies, then real TSC. The imprecise jiffies timestamps
during boot were confusing. However, now with the early branching, and
with the upcoming change Thomas suggested, which guarantees that if the
early TSC clock is enabled we keep it forever, the "< 2" is no longer
needed. I will update this patch accordingly.

Thank you,
Pavel
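
For readers following the thread, here is a minimal sketch of the
direction Pavel describes: record __gtod_offset in sched_clock_init(),
which runs pre-SMP, and have sched_clock_cpu() fall back to the raw
early clock only while sched_clock_running is still 0. This is not the
posted follow-up patch; the function bodies are reconstructed around
the lines quoted above (cpu_sdc(), sched_clock_local() and
sched_clock_remote() are existing helpers in kernel/sched/clock.c), and
the version that was eventually merged may differ in detail.

/*
 * Sketch only -- not the actual follow-up patch.  It illustrates the
 * change discussed above: the early-to-unstable transition is handled
 * in sched_clock_init(), so sched_clock_cpu() only needs the plain
 * !sched_clock_running test.
 */

void __init sched_clock_init(void)
{
	unsigned long flags;

	/*
	 * Snapshot the offset between the raw sched_clock() and
	 * ktime_get_ns() once, with interrupts off, before secondary
	 * CPUs are brought up.
	 */
	local_irq_save(flags);
	__gtod_offset = sched_clock() + __sched_clock_offset - ktime_get_ns();
	local_irq_restore(flags);

	sched_clock_running = 1;
}

u64 sched_clock_cpu(int cpu)
{
	struct sched_clock_data *scd;
	u64 clock;

	if (sched_clock_stable())
		return sched_clock() + __sched_clock_offset;

	/* Early boot: sched_clock_init() has not run yet. */
	if (unlikely(!sched_clock_running))
		return sched_clock() + __sched_clock_offset;

	preempt_disable_notrace();
	scd = cpu_sdc(cpu);

	if (cpu != smp_processor_id())
		clock = sched_clock_remote(scd);
	else
		clock = sched_clock_local(scd);
	preempt_enable_notrace();

	return clock;
}

Moving the offset capture into sched_clock_init() follows Peter's
observation that this is where the early-to-unstable transition
actually happens, and that it runs before SMP init, unlike
sched_clock_init_late().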