On Sun, Dec 13 2020, Jeff Layton wrote: > Overlayfs's volatile mounts want to be able to sample an error for > their own purposes, without preventing a later opener from potentially > seeing the error. > > The original reason for the SEEN flag was to make it so that we didn't > need to increment the counter if nothing had observed the latest value > and the error was the same. Eventually, a regression was reported in > the errseq_t conversion, and we fixed that by using the SEEN flag to > also mean that the error had been reported to userland at least once > somewhere. > > Those are two different states, however. If we instead take a second > flag bit from the counter, we can track these two things separately, > and accomodate the overlayfs volatile mount use-case. > > Add a new MUSTINC flag that indicates that the counter must be > incremented the next time an error is set, and rework the errseq > functions to set and clear that flag whenever the SEEN bit is set or > cleared. > > Test only for the MUSTINC bit when deciding whether to increment the > counter and only for the SEEN bit when deciding what to return in > errseq_sample. > > Add a new errseq_peek function to allow for the overlayfs use-case. > This just grabs the latest counter and sets the MUSTINC bit, leaving > the SEEN bit untouched. > > Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx> > --- > include/linux/errseq.h | 2 ++ > lib/errseq.c | 64 ++++++++++++++++++++++++++++++++++-------- > 2 files changed, 55 insertions(+), 11 deletions(-) > > diff --git a/include/linux/errseq.h b/include/linux/errseq.h > index fc2777770768..6d4b9bc629ac 100644 > --- a/include/linux/errseq.h > +++ b/include/linux/errseq.h > @@ -9,6 +9,8 @@ typedef u32 errseq_t; > > errseq_t errseq_set(errseq_t *eseq, int err); > errseq_t errseq_sample(errseq_t *eseq); > +errseq_t errseq_peek(errseq_t *eseq); > +errseq_t errseq_sample_advance(errseq_t *eseq); > int errseq_check(errseq_t *eseq, errseq_t since); > int errseq_check_and_advance(errseq_t *eseq, errseq_t *since); > #endif > diff --git a/lib/errseq.c b/lib/errseq.c > index 81f9e33aa7e7..5cc830f0361b 100644 > --- a/lib/errseq.c > +++ b/lib/errseq.c > @@ -38,8 +38,11 @@ > /* This bit is used as a flag to indicate whether the value has been seen */ > #define ERRSEQ_SEEN (1 << ERRSEQ_SHIFT) Would this look nicer using the BIT() macro? #define ERRSEQ_SEEN BIT(ERRSEQ_SHIFT) > > +/* This bit indicates that value must be incremented even when error is same */ > +#define ERRSEQ_MUSTINC (1 << (ERRSEQ_SHIFT + 1)) #define ERRSEQ_MUSTINC BIT(ERRSEQ_SHIFT+1) or if you don't like the BIT macro (not everyone does), then maybe #define ERR_SEQ_MUSTINC (ERRSEQ_SEEN << 1 ) ?? > + > /* The lowest bit of the counter */ > -#define ERRSEQ_CTR_INC (1 << (ERRSEQ_SHIFT + 1)) > +#define ERRSEQ_CTR_INC (1 << (ERRSEQ_SHIFT + 2)) Ditto. > > /** > * errseq_set - set a errseq_t for later reporting > @@ -77,11 +80,11 @@ errseq_t errseq_set(errseq_t *eseq, int err) > for (;;) { > errseq_t new; > > - /* Clear out error bits and set new error */ > - new = (old & ~(MAX_ERRNO|ERRSEQ_SEEN)) | -err; > + /* Clear out flag bits and set new error */ > + new = (old & ~(MAX_ERRNO|ERRSEQ_SEEN|ERRSEQ_MUSTINC)) | -err; This is starting to look clumsy (or maybe, this already looked clumsy, but now that is hard to ignore). new = (old & (ERRSEQ_CTR_INC - 1)) | -err Also this assumes MAX_ERRNO is a mask, which it is .. today. BUILD_BUG_ON(MAX_ERRNO & (MAX_ERRNO + 1)); ?? > > - /* Only increment if someone has looked at it */ > - if (old & ERRSEQ_SEEN) > + /* Only increment if we have to */ > + if (old & ERRSEQ_MUSTINC) > new += ERRSEQ_CTR_INC; > > /* If there would be no change, then call it done */ > @@ -122,14 +125,50 @@ EXPORT_SYMBOL(errseq_set); > errseq_t errseq_sample(errseq_t *eseq) > { > errseq_t old = READ_ONCE(*eseq); > + errseq_t new = old; > > - /* If nobody has seen this error yet, then we can be the first. */ > - if (!(old & ERRSEQ_SEEN)) > - old = 0; > - return old; > + /* > + * For the common case of no errors ever having been set, we can skip > + * marking the SEEN|MUSTINC bits. Once an error has been set, the value > + * will never go back to zero. > + */ > + if (old != 0) { > + new |= ERRSEQ_SEEN|ERRSEQ_MUSTINC; You lose me here. Why is ERRSEQ_SEEN being set, where it wasn't before? The ERRSEQ_SEEN flag not means precisely "The error has been reported to userspace". This operations isn't used to report errors - that is errseq_check(). I'm not saying the code it wrong - I really cannot tell. I'm just saying that I cannot see why it might be right. Thanks, NeilBrown > + if (old != new) > + cmpxchg(eseq, old, new); > + if (!(old & ERRSEQ_SEEN)) > + return 0; > + } > + return new; > } > EXPORT_SYMBOL(errseq_sample); > > +/** > + * errseq_peek - Grab current errseq_t value, but don't mark it SEEN > + * @eseq: Pointer to errseq_t to be sampled. > + * > + * In some cases, we need to be able to sample the errseq_t, but we're not > + * in a situation where we can report the value to userland. Use this > + * function to do that. This ensures that later errors will be recorded, > + * and that any current errors are reported at least once. > + * > + * Context: Any context. > + * Return: The current errseq value. > + */ > +errseq_t errseq_peek(errseq_t *eseq) > +{ > + errseq_t old = READ_ONCE(*eseq); > + errseq_t new = old; > + > + if (old != 0) { > + new |= ERRSEQ_MUSTINC; > + if (old != new) > + cmpxchg(eseq, old, new); > + } > + return new; > +} > +EXPORT_SYMBOL(errseq_peek); > + > /** > * errseq_check() - Has an error occurred since a particular sample point? > * @eseq: Pointer to errseq_t value to be checked. > @@ -143,7 +182,10 @@ EXPORT_SYMBOL(errseq_sample); > */ > int errseq_check(errseq_t *eseq, errseq_t since) > { > - errseq_t cur = READ_ONCE(*eseq); > + errseq_t cur = READ_ONCE(*eseq) & ~(ERRSEQ_MUSTINC|ERRSEQ_SEEN); > + > + /* Clear the flag bits for comparison */ > + since &= ~(ERRSEQ_MUSTINC|ERRSEQ_SEEN); > > if (likely(cur == since)) > return 0; > @@ -195,7 +237,7 @@ int errseq_check_and_advance(errseq_t *eseq, errseq_t *since) > * can advance "since" and return an error based on what we > * have. > */ > - new = old | ERRSEQ_SEEN; > + new = old | ERRSEQ_SEEN | ERRSEQ_MUSTINC; > if (new != old) > cmpxchg(eseq, old, new); > *since = new; > -- > 2.29.2
Attachment:
signature.asc
Description: PGP signature