On Sat, 2024-09-14 at 13:10 -0700, John Stultz wrote:
> On Sat, Sep 14, 2024 at 10:07 AM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> >
> > For multigrain timestamps, we must keep track of the latest timestamp
> > that has ever been handed out, and never hand out a coarse time below
> > that value.
> >
> > Add a static singleton atomic64_t into timekeeper.c that we can use to
> > keep track of the latest fine-grained time ever handed out. This is
> > tracked as a monotonic ktime_t value to ensure that it isn't affected by
> > clock jumps.
> >
> > Add two new public interfaces:
> >
> > - ktime_get_coarse_real_ts64_mg() fills a timespec64 with the later of the
> >   coarse-grained clock and the floor time
> >
> > - ktime_get_real_ts64_mg() gets the fine-grained clock value, and tries
> >   to swap it into the floor. A timespec64 is filled with the result.
> >
> > Since the floor is global, we take great pains to avoid updating it
> > unless it's absolutely necessary. If we do the cmpxchg and find that the
> > value has been updated since we fetched it, then we discard the
> > fine-grained time that was fetched in favor of the recent update.
> >
> > To maximize the window of this occurring when multiple tasks are racing
> > to update the floor, ktime_get_coarse_real_ts64_mg returns a cookie
> > value that represents the state of the floor tracking word, and
> > ktime_get_real_ts64_mg accepts a cookie value that it uses as the "old"
> > value when calling cmpxchg().
>
> This last bit seems out of date.
>

Thanks. Dropped the last paragraph.
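For readers following the thread, the floor rule described in the commit message can be modelled in userspace. The following is only a rough sketch, assuming C11 atomics in place of the kernel's atomic64_t and plain nanosecond integers in place of ktime_t and timespec64. The names floor_ns, model_coarse_real(), model_fine_real() and model_selfcheck() are hypothetical and do not appear in the patch, and the seqcount loop that samples the timekeeper is left out entirely.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for mg_floor: the latest fine-grained monotonic
 * time handed out so far, in nanoseconds.
 */
static _Atomic int64_t floor_ns;

/*
 * Model of the coarse getter: return the later of the coarse realtime
 * value and the floor converted to realtime (floor + offset).
 */
static int64_t model_coarse_real(int64_t coarse_real_ns, int64_t offset_ns)
{
	int64_t f_real = atomic_load(&floor_ns) + offset_ns;

	return f_real > coarse_real_ns ? f_real : coarse_real_ns;
}

/*
 * Model of the fine-grained getter: try to swap the fresh monotonic value
 * into the floor. If another task updated the floor first, discard our
 * value and return the floor that is now in place.
 */
static int64_t model_fine_real(int64_t fine_mono_ns, int64_t offset_ns)
{
	int64_t old = atomic_load(&floor_ns);

	if (atomic_compare_exchange_strong(&floor_ns, &old, fine_mono_ns))
		return fine_mono_ns + offset_ns;
	return old + offset_ns;	/* "old" now holds the current floor */
}

/* Self-check: a coarse read taken after a fine read never lands below it. */
static void model_selfcheck(void)
{
	const int64_t offset = 1000000;	/* made-up realtime minus monotonic */
	int64_t fine, coarse;

	atomic_store(&floor_ns, 0);
	fine = model_fine_real(5500, offset);		/* floor becomes 5500 */
	coarse = model_coarse_real(offset + 5000, offset); /* stale coarse tick */

	assert(fine == offset + 5500);
	assert(coarse >= fine);
}
```

Keeping the floor in monotonic time and adding the offset only when a value is handed out mirrors the commit message's point that the floor itself should not be affected by clock jumps.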
> > ---
> >  include/linux/timekeeping.h |  4 +++
> >  kernel/time/timekeeping.c   | 82 +++++++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 86 insertions(+)
> >
> > diff --git a/include/linux/timekeeping.h b/include/linux/timekeeping.h
> > index fc12a9ba2c88..7aa85246c183 100644
> > --- a/include/linux/timekeeping.h
> > +++ b/include/linux/timekeeping.h
> > @@ -45,6 +45,10 @@ extern void ktime_get_real_ts64(struct timespec64 *tv);
> >  extern void ktime_get_coarse_ts64(struct timespec64 *ts);
> >  extern void ktime_get_coarse_real_ts64(struct timespec64 *ts);
> >
> > +/* Multigrain timestamp interfaces */
> > +extern void ktime_get_coarse_real_ts64_mg(struct timespec64 *ts);
> > +extern void ktime_get_real_ts64_mg(struct timespec64 *ts);
> > +
> >  void getboottime64(struct timespec64 *ts);
> >
> >  /*
> > diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> > index 5391e4167d60..16937242b904 100644
> > --- a/kernel/time/timekeeping.c
> > +++ b/kernel/time/timekeeping.c
> > @@ -114,6 +114,13 @@ static struct tk_fast tk_fast_raw ____cacheline_aligned = {
> >  	.base[1] = FAST_TK_INIT,
> >  };
> >
> > +/*
> > + * This represents the latest fine-grained time that we have handed out as a
> > + * timestamp on the system. Tracked as a monotonic ktime_t, and converted to the
> > + * realtime clock on an as-needed basis.
> > + */
> > +static __cacheline_aligned_in_smp atomic64_t mg_floor;
> > +
> >  static inline void tk_normalize_xtime(struct timekeeper *tk)
> >  {
> >  	while (tk->tkr_mono.xtime_nsec >= ((u64)NSEC_PER_SEC << tk->tkr_mono.shift)) {
> > @@ -2394,6 +2401,81 @@ void ktime_get_coarse_real_ts64(struct timespec64 *ts)
> >  }
> >  EXPORT_SYMBOL(ktime_get_coarse_real_ts64);
> >
> > +/**
> > + * ktime_get_coarse_real_ts64_mg - get later of coarse grained time or floor
> > + * @ts: timespec64 to be filled
> > + *
> > + * Adjust floor to realtime and compare it to the coarse time. Fill
> > + * @ts with the latest one. Note that this is a filesystem-specific
> > + * interface and should be avoided outside of that context.
> > + */
> > +void ktime_get_coarse_real_ts64_mg(struct timespec64 *ts)
> > +{
> > +	struct timekeeper *tk = &tk_core.timekeeper;
> > +	u64 floor = atomic64_read(&mg_floor);
> > +	ktime_t f_real, offset, coarse;
> > +	unsigned int seq;
> > +
> > +	WARN_ON(timekeeping_suspended);
> > +
> > +	do {
> > +		seq = read_seqcount_begin(&tk_core.seq);
> > +		*ts = tk_xtime(tk);
> > +		offset = *offsets[TK_OFFS_REAL];
> > +	} while (read_seqcount_retry(&tk_core.seq, seq));
> > +
> > +	coarse = timespec64_to_ktime(*ts);
> > +	f_real = ktime_add(floor, offset);
> > +	if (ktime_after(f_real, coarse))
> > +		*ts = ktime_to_timespec64(f_real);
> > +}
> > +EXPORT_SYMBOL_GPL(ktime_get_coarse_real_ts64_mg);
> > +
> > +/**
> > + * ktime_get_real_ts64_mg - attempt to update floor value and return result
> > + * @ts: pointer to the timespec to be set
> > + *
> > + * Get a current monotonic fine-grained time value and attempt to swap
> > + * it into the floor. @ts will be filled with the resulting floor value,
> > + * regardless of the outcome of the swap. Note that this is a filesystem
> > + * specific interface and should be avoided outside of that context.
> > + */
> > +void ktime_get_real_ts64_mg(struct timespec64 *ts, u64 cookie)
>
> Still passing a cookie. It doesn't match the header definition, so I'm
> surprised this builds.
>

Yeah, I didn't see a warning when I built it. Luckily the extra
parameter is ignored anyway, so it does no harm. I've fixed that up in
my tree.
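The else branch in the function body quoted below depends on the try_cmpxchg convention that a failed exchange writes the value currently in the atomic back into "old". A small userspace illustration of that convention follows. It is only a sketch, assuming C11's atomic_compare_exchange_strong(), which updates its "expected" argument on failure in the same way. The names demo_floor, try_advance_floor() and race_demo() are hypothetical, and the two "tasks" are simulated sequentially rather than with real threads.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the global floor word. */
static _Atomic int64_t demo_floor;

/*
 * Try to install "mine" as the new floor, given the value previously
 * sampled into *old. Returns the value the caller should report: its own
 * on success, or the winner's (written back into *old) on failure.
 */
static int64_t try_advance_floor(int64_t *old, int64_t mine)
{
	if (atomic_compare_exchange_strong(&demo_floor, old, mine))
		return mine;
	return *old;
}

/* Two tasks sample the same floor, then race to update it. */
static void race_demo(void)
{
	int64_t old_a, old_b, got_a, got_b;

	atomic_store(&demo_floor, 100);
	old_a = atomic_load(&demo_floor);	/* task A samples 100 */
	old_b = atomic_load(&demo_floor);	/* task B samples 100 */

	got_b = try_advance_floor(&old_b, 250);	/* B gets there first */
	got_a = try_advance_floor(&old_a, 240);	/* A's sample is stale */

	assert(got_b == 250);
	assert(got_a == 250);	/* A discards 240 and reports B's value */
	assert(old_a == 250);	/* the failed exchange refreshed A's "old" */
	assert(atomic_load(&demo_floor) == 250);
}
```

The losing task issues exactly one atomic operation and never retries, which is consistent with the stated goal of not updating the global floor unless it is absolutely necessary.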
> > +{
> > +	struct timekeeper *tk = &tk_core.timekeeper;
> > +	ktime_t old = atomic64_read(&mg_floor);
> > +	ktime_t offset, mono;
> > +	unsigned int seq;
> > +	u64 nsecs;
> > +
> > +	WARN_ON(timekeeping_suspended);
> > +
> > +	do {
> > +		seq = read_seqcount_begin(&tk_core.seq);
> > +
> > +		ts->tv_sec = tk->xtime_sec;
> > +		mono = tk->tkr_mono.base;
> > +		nsecs = timekeeping_get_ns(&tk->tkr_mono);
> > +		offset = *offsets[TK_OFFS_REAL];
> > +	} while (read_seqcount_retry(&tk_core.seq, seq));
> > +
> > +	mono = ktime_add_ns(mono, nsecs);
> > +
> > +	if (atomic64_try_cmpxchg(&mg_floor, &old, mono)) {
> > +		ts->tv_nsec = 0;
> > +		timespec64_add_ns(ts, nsecs);
> > +	} else {
> > +		/*
> > +		 * Something has changed mg_floor since "old" was
> > +		 * fetched. "old" has now been updated with the
> > +		 * current value of mg_floor, so use that to return
> > +		 * the current coarse floor value.
> > +		 */
> > +		*ts = ktime_to_timespec64(ktime_add(old, offset));
> > +	}
> > +}
> > +EXPORT_SYMBOL_GPL(ktime_get_real_ts64_mg);
>
> Other than those issues, I'm ok with it. Thanks again for working
> through my concerns!
>
> Since I'm traveling for LPC soon, to save the next cycle, once the
> fixes above are sorted:
>   Acked-by: John Stultz <jstultz@xxxxxxxxxx>
>

Thanks for the review!
--
Jeff Layton <jlayton@xxxxxxxxxx>