Re: [PATCH v4 13/27] lib: add errseq_t type and infrastructure for handling it

Jan Kara <jack@xxxxxxx> · Wed, 10 May 2017 13:34:30 +0200

On Tue 09-05-17 11:49:16, Jeff Layton wrote:
> An errseq_t is a way of recording errors in one place, and allowing any
> number of "subscribers" to tell whether an error has been set again
> since a previous time.
> 
> It's implemented as an unsigned 32-bit value that is managed with atomic
> operations. The low order bits are designated to hold an error code
> (max size of MAX_ERRNO). The upper bits are used as a counter.
> 
> The API works with consumers sampling an errseq_t value at a particular
> point in time. Later, that value can be used to tell whether new errors
> have been set since that time.
> 
> Note that there is a 1 in 512k risk of collisions here if new errors
> are being recorded frequently, since we have so few bits to use as a
> counter. To mitigate this, one bit is used as a flag to tell whether the
> value has been sampled since a new value was recorded. That allows
> us to avoid bumping the counter if no one has sampled it since it
> was last bumped.
> 
> Later patches will build on this infrastructure to change how writeback
> errors are tracked in the kernel.
> 
> Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx>

The patch looks good to me. Feel free to add:

Reviewed-by: Jan Kara <jack@xxxxxxx>

Just two nits below:
...
> +int errseq_check_and_advance(errseq_t *eseq, errseq_t *since)
> +{
> +	int err = 0;
> +	errseq_t old, new;
> +
> +	/*
> +	 * Most callers will want to use the inline wrapper to check this,
> +	 * so that the common case of no error is handled without needing
> +	 * to lock.
> +	 */

I'm not sure which locking you are speaking about here. Is the comment
stale?

> +	old = READ_ONCE(*eseq);
> +	if (old != *since) {
> +		/*
> +		 * Set the flag and try to swap it into place if it has
> +		 * changed.
> +		 *
> +		 * We don't care about the outcome of the swap here. If the
> +		 * swap doesn't occur, then it has either been updated by a
> +		 * writer who is bumping the seq count anyway, or another

"bumping the seq count anyway" part is not quite true. Writer may see
ERRSEQ_SEEN not set and so just update the error code and leave seq count
as is. But since you compare full errseq_t for equality, this works out as
designed...

> +		 * reader who is just setting the "seen" flag. Either outcome
> +		 * is OK, and we can advance "since" and return an error based
> +		 * on what we have.
> +		 */
> +		new = old | ERRSEQ_SEEN;
> +		if (new != old)
> +			cmpxchg(eseq, old, new);
> +		*since = new;
> +		err = -(new & MAX_ERRNO);
> +	}
> +	return err;
> +}
> +EXPORT_SYMBOL(errseq_check_and_advance);

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR