On Sat, Dec 12, 2020 at 11:57:52AM +0200, Amir Goldstein wrote:
> Forgot to CC Jeff?
>
Oops.

> On Sat, Dec 12, 2020 at 1:50 AM Sargun Dhillon <sargun@xxxxxxxxx> wrote:
> >
> > This adds the function errseq_counter_sample to allow for "subscribers"
> > to take point-in-time snapshots of the errseq_counter, and store the
> > counter + errseq_t.
> >
> > Signed-off-by: Sargun Dhillon <sargun@xxxxxxxxx>
> > ---
> >  include/linux/errseq.h |  4 ++++
> >  lib/errseq.c           | 51 ++++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 55 insertions(+)
> >
> > diff --git a/include/linux/errseq.h b/include/linux/errseq.h
> > index 35818c484290..8998df499a3b 100644
> > --- a/include/linux/errseq.h
> > +++ b/include/linux/errseq.h
> > @@ -25,4 +25,8 @@ errseq_t errseq_set(errseq_t *eseq, int err);
> >  errseq_t errseq_sample(errseq_t *eseq);
> >  int errseq_check(errseq_t *eseq, errseq_t since);
> >  int errseq_check_and_advance(errseq_t *eseq, errseq_t *since);
> > +void errseq_counter_sample(errseq_t *dst_errseq, int *dst_errors,
> > +			   struct errseq_counter *counter);
> > +int errseq_counter_check(struct errseq_counter *counter, errseq_t errseq_since,
> > +			 int errors_since);
> >  #endif
> > diff --git a/lib/errseq.c b/lib/errseq.c
> > index d555e7fc18d2..98fcfafa3d97 100644
> > --- a/lib/errseq.c
> > +++ b/lib/errseq.c
> > @@ -246,3 +246,54 @@ int errseq_check_and_advance(errseq_t *eseq, errseq_t *since)
> >  	return err;
> >  }
> >  EXPORT_SYMBOL(errseq_check_and_advance);
> > +
> > +/**
> > + * errseq_counter_sample() - Grab the current errseq_counter value
> > + * @dst_errseq: The errseq_t to copy to
> > + * @dst_errors: The destination overflow to copy to
> > + * @counter: The errseq_counter to copy from
> > + *
> > + * Grabs a point in time sample of the errseq_counter for latter comparison
> > + */
> > +void errseq_counter_sample(errseq_t *dst_errseq, int *dst_errors,
>
> Why 2 arguments and not struct errseq_counter *dst_counter?
>
Mostly to avoid having to use atomic_* when setting this value, and to avoid
locking another cacheline on the CPU. IIRC, atomic_t is always 4-byte aligned
but int doesn't have to be.

> > +			   struct errseq_counter *counter)
> > +{
> > +	errseq_t cur;
> > +
> > +	do {
> > +		cur = READ_ONCE(counter->errseq);
> > +		*dst_errors = atomic_read(&counter->errors);
> > +	} while (cur != READ_ONCE(counter->errseq));
>
> This loop seems odd. I think the return value should reflect the fact that
> the snapshot failed and let the caller decide if it wants to loop.
>
> And about the one and only introduced caller, I think the answer is that
> it shouldn't loop. If volatile overlayfs mount tries to sample the upper sb
> error counter and an unseen error exists, I argued before that I think
> mount should fail, so that the container orchestrator can decide what to do.
> Failure to take an errseq_counter sample means than an unseen error
> has been observed at least in the first or second check.
>
I guess. In the "good" case, the computational cost is the same, but in the
bad case (an error occurs while we are snapshotting) the current code takes
another spin through the loop.

> > +
> > +	/* Clear the seen bit to make checking later easier */
> > +	*dst_errseq = cur & ~ERRSEQ_SEEN;
> > +}
> > +EXPORT_SYMBOL(errseq_counter_sample);
> > +
> > +/**
> > + * errseq_counter_check() - Has an error occurred since the sample
> > + * @counter: The errseq_counter from which to check.
> > + * @errseq_since: The errseq_t sampled with errseq_counter_sample to check
> > + * @errors_since: The errors sampled with errseq_counter_sample to check
> > + *
> > + * Returns: The latest error set in the errseq_t or 0 if there have been none.
> > + */
> > +int errseq_counter_check(struct errseq_counter *counter, errseq_t errseq_since,
> > +			 int errors_since)
> > +{
> > +	errseq_t cur_errseq;
> > +	int cur_errors;
> > +
> > +	cur_errors = atomic_read(&counter->errors);
> > +	/* To match the barrier in errseq_counter_set */
> > +	smp_rmb();
> > +
> > +	/* Clear / ignore the seen bit as we do at sample time */
> > +	cur_errseq = READ_ONCE(counter->errseq) & ~ERRSEQ_SEEN;
> > +
> > +	if (cur_errseq == errseq_since && errors_since == cur_errors)
> > +		return 0;
> > +
> > +	return -(cur_errseq & MAX_ERRNO);
> > +}
> >
>
> Same here. Why not pass an errseq_counter_since argument?
>
> Thanks,
> Amir.

See above. I can change this. I mulled over this decision a bunch;
unfortunately, (micro)benchmarking was inconclusive as to whether it made a
difference.
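
For reference, a non-looping sample along the lines suggested in the review
could look roughly like the following. This is only a user-space sketch built
on C11 atomics, not the kernel code: the model_* names, the -EAGAIN return
value, and the position of the "seen" bit are assumptions made for
illustration, and the default sequentially consistent atomics are stronger
than the READ_ONCE()/smp_rmb() pairing used in the patch.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>

/* User-space stand-ins (hypothetical); the real definitions are in lib/errseq.c. */
#define MODEL_MAX_ERRNO   4095u
#define MODEL_ERRSEQ_SEEN (1u << 12)	/* assumed: first bit above MAX_ERRNO */

struct model_errseq_counter {
	_Atomic uint32_t errseq;	/* models errseq_t */
	atomic_int errors;		/* models the atomic_t error count */
};

/*
 * Take one snapshot attempt. Returns 0 if errseq was stable across the
 * read of the error count, or -EAGAIN if it changed, leaving the decision
 * to retry (or fail) to the caller.
 */
static int model_counter_try_sample(uint32_t *dst_errseq, int *dst_errors,
				    struct model_errseq_counter *counter)
{
	uint32_t cur = atomic_load(&counter->errseq);

	*dst_errors = atomic_load(&counter->errors);
	if (cur != atomic_load(&counter->errseq))
		return -EAGAIN;

	/* Clear the seen bit so the later comparison can ignore it */
	*dst_errseq = cur & ~MODEL_ERRSEQ_SEEN;
	return 0;
}

/* Returns 0 if nothing changed since the sample, else the negated errno bits. */
static int model_counter_check(struct model_errseq_counter *counter,
			       uint32_t errseq_since, int errors_since)
{
	int cur_errors = atomic_load(&counter->errors);
	uint32_t cur_errseq = atomic_load(&counter->errseq) & ~MODEL_ERRSEQ_SEEN;

	if (cur_errseq == errseq_since && cur_errors == errors_since)
		return 0;

	return -(int)(cur_errseq & MODEL_MAX_ERRNO);
}
```

With this shape, a caller such as the volatile overlayfs mount path could
treat -EAGAIN as "an unseen error raced with the sample" and fail the mount
rather than spin.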