Patrick Steinhardt <ps@xxxxxx> writes:

> -	for (i = first; i <= last; i++) {
> -		stack_filename(&table_name, st, reader_name(st->readers[i]));
> +	for (i = last + 1; i > first; i--) {
> +		stack_filename(&table_name, st, reader_name(st->readers[i - 1]));
>
>  		err = hold_lock_file_for_update(&table_locks[nlocks],
>  						table_name.buf, LOCK_NO_DEREF);
>  		if (err < 0) {
> -			if (errno == EEXIST)
> +			/*
> +			 * When the table is locked already we may do a
> +			 * best-effort compaction and compact only the tables
> +			 * that we have managed to lock so far. This of course
> +			 * requires that we have been able to lock at least two
> +			 * tables, otherwise there would be nothing to compact.
> +			 * In that case, we return a lock error to our caller.
> +			 */
> +			if (errno == EEXIST && last - (i - 1) >= 2 &&
> +			    flags & STACK_COMPACT_RANGE_BEST_EFFORT) {
> +				err = 0;
> +				/*
> +				 * The subtraction is to offset the index, the
> +				 * addition is to only compact up to the table
> +				 * of the preceding iteration. They obviously
> +				 * cancel each other out, but that may be
> +				 * non-obvious when it was omitted.
> +				 */
> +				first = (i - 1) + 1;
> +				break;

OK, so this case no longer jumps to the "done" label.  The tables
from the high end of the st->readers[] array down to the last one we
managed to lock will be processed by the existing logic that
compacts first..last, which makes sense.

> +			} else if (errno == EEXIST) {
>  				err = REFTABLE_LOCK_ERROR;
> -			else
> +				goto done;
> +			} else {
>  				err = REFTABLE_IO_ERROR;
> -			goto done;
> +				goto done;
> +			}
>  		}
>
>  	/*
> @@ -1270,7 +1308,7 @@ static int stack_compact_range(struct reftable_stack *st,
>  	 * delete the files after we closed them on Windows, so this needs to
>  	 * happen first.
>  	 */
> -	err = reftable_stack_reload_maybe_reuse(st, first < last);
> +	err = reftable_stack_reload_maybe_reuse(st, first_to_replace < last_to_replace);

What is this change about?  No code that computes first_to_replace
and last_to_replace has changed, has it?

Perhaps before this step, using first/last and using
first_to_replace/last_to_replace made no difference, because we
never dealt with a case where we failed to lock some of the tables?

I am wondering if this would be helped by a no-op clarification
patch before the actual behaviour change, similar to how step 4/8
added the nlocks variable.

Thanks.
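
P.S. Just to double-check that I am reading the new loop correctly,
here is a toy model of its index arithmetic.  Everything here is
made up for illustration: simulate() is not a real function, the
locked[] array stands in for hold_lock_file_for_update() failing
with EEXIST, and the STACK_COMPACT_RANGE_BEST_EFFORT flag check is
assumed to be in effect throughout.

    #include <stdio.h>

    /*
     * Walk from "last" down to "first" the way the patched loop
     * does; when a table turns out to be locked by somebody else,
     * compact only the tables above it that we did manage to lock.
     */
    static void simulate(const int *locked, int first, int last)
    {
    	int i;

    	for (i = last + 1; i > first; i--) {
    		if (!locked[i - 1])
    			continue; /* we took this lock, keep going down */
    		if (last - (i - 1) >= 2) {
    			/*
    			 * Tables (i - 1) + 1 .. last are ours; the
    			 * "+ 1" excludes the table that failed to
    			 * lock, just like in the patch.
    			 */
    			first = (i - 1) + 1;
    			break;
    		}
    		printf("lock error: fewer than two tables locked\n");
    		return;
    	}
    	printf("compacting tables %d..%d\n", first, last);
    }

    int main(void)
    {
    	/* table 1 is held by somebody else */
    	int locked[] = { 0, 1, 0, 0, 0 };

    	simulate(locked, 0, 4);
    	return 0;
    }

With table 1 held by another process, this prints "compacting tables
2..4", i.e. only the newest tables that we managed to lock, which
matches my reading above.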