Patrick Steinhardt <ps@xxxxxx> writes:

> -	for (i = first; i <= last; i++) {
> -		stack_filename(&table_name, st, reader_name(st->readers[i]));
> +	for (i = last + 1; i > first; i--) {
> +		stack_filename(&table_name, st, reader_name(st->readers[i - 1]));
>
>  		err = hold_lock_file_for_update(&table_locks[nlocks],
>  						table_name.buf, LOCK_NO_DEREF);
>  		if (err < 0) {
> -			if (errno == EEXIST)
> +			/*
> +			 * When the table is locked already we may do a
> +			 * best-effort compaction and compact only the tables
> +			 * that we have managed to lock so far. This of course
> +			 * requires that we have been able to lock at least two
> +			 * tables, otherwise there would be nothing to compact.
> +			 * In that case, we return a lock error to our caller.
> +			 */
> +			if (errno == EEXIST && last - (i - 1) >= 2 &&
> +			    flags & STACK_COMPACT_RANGE_BEST_EFFORT) {
> +				err = 0;
> +				/*
> +				 * The subtraction is to offset the index, the
> +				 * addition is to only compact up to the table
> +				 * of the preceding iteration. They obviously
> +				 * cancel each other out, but that may be
> +				 * non-obvious when it was omitted.
> +				 */
> +				first = (i - 1) + 1;
> +				break;

OK, so this case no longer jumps to the "done" label.  The tables
from the high end of the st->readers[] array down to the last one we
managed to lock will be processed by the existing logic that
compacts first..last, which makes sense.

> +			} else if (errno == EEXIST) {
>  				err = REFTABLE_LOCK_ERROR;
> -			else
> +				goto done;
> +			} else {
>  				err = REFTABLE_IO_ERROR;
> -			goto done;
> +				goto done;
> +			}
>  		}
>
>  	/*
> @@ -1270,7 +1308,7 @@ static int stack_compact_range(struct reftable_stack *st,
>  	 * delete the files after we closed them on Windows, so this needs to
>  	 * happen first.
>  	 */
> -	err = reftable_stack_reload_maybe_reuse(st, first < last);
> +	err = reftable_stack_reload_maybe_reuse(st, first_to_replace < last_to_replace);

What is this change about?  No code that computes first_to_replace
and last_to_replace has changed, has it?

Perhaps before this step, using first/last and using
first_to_replace/last_to_replace made no difference, because we
never dealt with a case where we failed to lock some of the tables?

I am wondering if this would be helped by a no-op clarification
patch before the actual behaviour change, similar to how step 4/8
added the nlocks variable.

Thanks.
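
P.S. Just to double-check that I am reading the new loop correctly,
here is a toy model of its index arithmetic.  Everything here is
made up for illustration: simulate() is not a real function, the
locked[] array stands in for hold_lock_file_for_update() failing
with EEXIST, and the STACK_COMPACT_RANGE_BEST_EFFORT flag check is
assumed to be in effect throughout.

    #include <stdio.h>

    /*
     * Walk from "last" down to "first" the way the patched loop
     * does; when a table turns out to be locked by somebody else,
     * compact only the tables above it that we did manage to lock.
     */
    static void simulate(const int *locked, int first, int last)
    {
    	int i;

    	for (i = last + 1; i > first; i--) {
    		if (!locked[i - 1])
    			continue; /* we took this lock, keep going down */
    		if (last - (i - 1) >= 2) {
    			/*
    			 * Tables (i - 1) + 1 .. last are ours; the
    			 * "+ 1" excludes the table that failed to
    			 * lock, just like in the patch.
    			 */
    			first = (i - 1) + 1;
    			break;
    		}
    		printf("lock error: fewer than two tables locked\n");
    		return;
    	}
    	printf("compacting tables %d..%d\n", first, last);
    }

    int main(void)
    {
    	/* table 1 is held by somebody else */
    	int locked[] = { 0, 1, 0, 0, 0 };

    	simulate(locked, 0, 4);
    	return 0;
    }

With table 1 held by another process, this prints "compacting tables
2..4", i.e. only the newest tables that we managed to lock, which
matches my reading above.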