Re: [PATCH v2 2/3] errseq: Add mechanism to snapshot errseq_counter and check snapshot

Sargun Dhillon <sargun@xxxxxxxxx> · Sun, 13 Dec 2020 19:41:34 +0000

On Sat, Dec 12, 2020 at 11:57:52AM +0200, Amir Goldstein wrote:
> Forgot to CC Jeff?
> 
Oops.
> On Sat, Dec 12, 2020 at 1:50 AM Sargun Dhillon <sargun@xxxxxxxxx> wrote:
> >
> > This adds the function errseq_counter_sample to allow for "subscribers"
> > to take point-in-time snapshots of the errseq_counter, and store the
> > counter + errseq_t.
> >
> > Signed-off-by: Sargun Dhillon <sargun@xxxxxxxxx>
> > ---
> >  include/linux/errseq.h |  4 ++++
> >  lib/errseq.c           | 51 ++++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 55 insertions(+)
> >
> > diff --git a/include/linux/errseq.h b/include/linux/errseq.h
> > index 35818c484290..8998df499a3b 100644
> > --- a/include/linux/errseq.h
> > +++ b/include/linux/errseq.h
> > @@ -25,4 +25,8 @@ errseq_t errseq_set(errseq_t *eseq, int err);
> >  errseq_t errseq_sample(errseq_t *eseq);
> >  int errseq_check(errseq_t *eseq, errseq_t since);
> >  int errseq_check_and_advance(errseq_t *eseq, errseq_t *since);
> > +void errseq_counter_sample(errseq_t *dst_errseq, int *dst_errors,
> > +                          struct errseq_counter *counter);
> > +int errseq_counter_check(struct errseq_counter *counter, errseq_t errseq_since,
> > +                        int errors_since);
> >  #endif
> > diff --git a/lib/errseq.c b/lib/errseq.c
> > index d555e7fc18d2..98fcfafa3d97 100644
> > --- a/lib/errseq.c
> > +++ b/lib/errseq.c
> > @@ -246,3 +246,54 @@ int errseq_check_and_advance(errseq_t *eseq, errseq_t *since)
> >         return err;
> >  }
> >  EXPORT_SYMBOL(errseq_check_and_advance);
> > +
> > +/**
> > + * errseq_counter_sample() - Grab the current errseq_counter value
> > + * @dst_errseq: The errseq_t to copy to
> > + * @dst_errors: The destination overflow to copy to
> > + * @counter: The errseq_counter to copy from
> > + *
> > + * Grabs a point in time sample of the errseq_counter for later comparison
> > + */
> > +void errseq_counter_sample(errseq_t *dst_errseq, int *dst_errors,
> 
> Why 2 arguments and not struct errseq_counter *dst_counter?
> 

Mostly to avoid having to use atomic_* when setting this value, and to avoid
locking another cacheline on the CPU. IIRC, atomic_t is always 4-byte aligned
but int doesn't have to be.

> > +                          struct errseq_counter *counter)
> > +{
> > +       errseq_t cur;
> > +
> > +       do {
> > +               cur = READ_ONCE(counter->errseq);
> > +               *dst_errors = atomic_read(&counter->errors);
> > +       } while (cur != READ_ONCE(counter->errseq));
> 
> This loop seems odd. I think the return value should reflect the fact that
> the snapshot failed and let the caller decide if it wants to loop.
> 
> And about the one and only introduced caller, I think the answer is that
> it shouldn't loop. If volatile overlayfs mount tries to sample the upper sb
> error counter and an unseen error exists, I argued before that I think
> mount should fail, so that the container orchestrator can decide what to do.
> Failure to take an errseq_counter sample means that an unseen error
> has been observed at least in the first or second check.
> 

I guess. In the "good" case, the computational cost is the same, but the bad
case (an error occurs while we are snapshotting) results in another spin.
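
A rough sketch of what the non-looping variant could look like, written as
userspace C with C11 atomics standing in for READ_ONCE() and atomic_read().
The name errseq_counter_try_sample, the bool return, and the struct layout are
made up for illustration and are not part of the patch:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel types; the layout is assumed. */
typedef uint32_t errseq_t;
#define ERRSEQ_SEEN (1u << 12) /* assumed: the bit just above MAX_ERRNO */

struct errseq_counter {
	_Atomic errseq_t errseq;
	atomic_int errors;
};

/*
 * Hypothetical non-looping sample: a single attempt, and the caller is
 * told when errseq changed between the two reads.
 */
bool errseq_counter_try_sample(errseq_t *dst_errseq, int *dst_errors,
			       struct errseq_counter *counter)
{
	errseq_t cur = atomic_load(&counter->errseq);

	*dst_errors = atomic_load(&counter->errors);
	if (cur != atomic_load(&counter->errseq))
		return false; /* raced with a writer; caller decides what to do */

	/* Clear the seen bit, as the patch does at sample time */
	*dst_errseq = cur & ~ERRSEQ_SEEN;
	return true;
}
```

With this shape the overlayfs mount path could treat a false return as "an
error happened while sampling" and fail the mount instead of retrying.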

> > +
> > +       /* Clear the seen bit to make checking later easier */
> > +       *dst_errseq = cur & ~ERRSEQ_SEEN;
> > +}
> > +EXPORT_SYMBOL(errseq_counter_sample);
> > +
> > +/**
> > + * errseq_counter_check() - Has an error occurred since the sample
> > + * @counter: The errseq_counter from which to check.
> > + * @errseq_since: The errseq_t sampled with errseq_counter_sample to check
> > + * @errors_since: The errors sampled with errseq_counter_sample to check
> > + *
> > + * Returns: The latest error set in the errseq_t or 0 if there have been none.
> > + */
> > +int errseq_counter_check(struct errseq_counter *counter, errseq_t errseq_since,
> > +                        int errors_since)
> > +{
> > +       errseq_t cur_errseq;
> > +       int cur_errors;
> > +
> > +       cur_errors = atomic_read(&counter->errors);
> > +       /* To match the barrier in errseq_counter_set */
> > +       smp_rmb();
> > +
> > +       /* Clear / ignore the seen bit as we do at sample time */
> > +       cur_errseq = READ_ONCE(counter->errseq) & ~ERRSEQ_SEEN;
> > +
> > +       if (cur_errseq == errseq_since && errors_since == cur_errors)
> > +               return 0;
> > +
> > +       return -(cur_errseq & MAX_ERRNO);
> > +}
> 
> 
> Same here. Why not pass an errseq_counter_since argument?
> 
> Thanks,
> Amir.

See above. I can change this. I mulled over this decision a bunch;
unfortunately, (micro)benchmarking was inconclusive as to whether it makes a
difference or not.
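
For comparison, the struct-based signatures would look roughly like the sketch
below. This is userspace C with C11 atomics as stand-ins for the kernel
primitives (sequentially consistent loads take the place of the smp_rmb()
pairing); the function names, the struct layout, and the constant values are
assumptions for illustration only:

```c
#include <stdatomic.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel types; layout and values are assumed. */
typedef uint32_t errseq_t;
#define MAX_ERRNO   4095u
#define ERRSEQ_SEEN (1u << 12)

struct errseq_counter {
	_Atomic errseq_t errseq;
	atomic_int errors;
};

/* Hypothetical: the snapshot lives in a struct errseq_counter of its own. */
void errseq_counter_sample_into(struct errseq_counter *dst,
				struct errseq_counter *src)
{
	errseq_t cur;
	int errors;

	do {
		cur = atomic_load(&src->errseq);
		errors = atomic_load(&src->errors);
	} while (cur != atomic_load(&src->errseq));

	/* The destination now needs atomic stores, unlike a plain int */
	atomic_store(&dst->errseq, cur & ~ERRSEQ_SEEN);
	atomic_store(&dst->errors, errors);
}

/* Hypothetical: compare the live counter against a struct snapshot. */
int errseq_counter_check_since(struct errseq_counter *counter,
			       struct errseq_counter *since)
{
	int cur_errors = atomic_load(&counter->errors);
	errseq_t cur = atomic_load(&counter->errseq) & ~ERRSEQ_SEEN;

	if (cur == atomic_load(&since->errseq) &&
	    cur_errors == atomic_load(&since->errors))
		return 0;

	return -(int)(cur & MAX_ERRNO);
}
```

The extra atomic stores into the snapshot are the cost I was trying to avoid
with the two-argument form.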