Re: [PATCH v2 3/3] reftable: prevent 'update_index' changes after adding records

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Jan 21, 2025 at 04:34:12AM +0100, Karthik Nayak wrote:
> The function `reftable_writer_set_limits()` allows updating the
> 'min_update_index' and 'max_update_index' of a reftable writer. These
> values are written to both the writer's header and footer.
> 
> Since the header is written during the first block write, any subsequent
> changes to the update index would create a mismatch between the header
> and footer values. The footer would contain the newer values while the
> header retained the original ones.
> 
> To fix this bug, prevent callers from updating these values after any

Nit: it's not really fixing a bug, but protecting us against it. Not
worth a reroll though, from my point of view.

> diff --git a/reftable/reftable-writer.h b/reftable/reftable-writer.h
> index 5f9afa620bb00de66c311765fb0ae8c6f56401ae..1ea014d389cc47f173279e3234a82f3fcbc807a0 100644
> --- a/reftable/reftable-writer.h
> +++ b/reftable/reftable-writer.h
> @@ -124,17 +124,21 @@ int reftable_writer_new(struct reftable_writer **out,
>  			int (*flush_func)(void *),
>  			void *writer_arg, const struct reftable_write_options *opts);
>  
> -/* Set the range of update indices for the records we will add. When writing a
> -   table into a stack, the min should be at least
> -   reftable_stack_next_update_index(), or REFTABLE_API_ERROR is returned.
> -
> -   For transactional updates to a stack, typically min==max, and the
> -   update_index can be obtained by inspeciting the stack. When converting an
> -   existing ref database into a single reftable, this would be a range of
> -   update-index timestamps.
> +/*
> + * Set the range of update indices for the records we will add. When writing a
> + * table into a stack, the min should be at least
> + * reftable_stack_next_update_index(), or REFTABLE_API_ERROR is returned.
> + *
> + * For transactional updates to a stack, typically min==max, and the
> + * update_index can be obtained by inspeciting the stack. When converting an
> + * existing ref database into a single reftable, this would be a range of
> + * update-index timestamps.
> + *
> + * The function should be called before adding any records to the writer. If not
> + * it will fail with REFTABLE_API_ERROR.
>   */

Thanks for updating this. I think the reftable library is one of those
code areas where it makes sense to sneak in a formatting fix every now
and then because its coding style is quite alien to Git's own in some
places. We could also do it all in one go, but I strongly doubt that it
would be worth the churn.

> -void reftable_writer_set_limits(struct reftable_writer *w, uint64_t min,
> -				uint64_t max);
> +int reftable_writer_set_limits(struct reftable_writer *w, uint64_t min,
> +			       uint64_t max);
>  
>  /*
>    Add a reftable_ref_record. The record should have names that come after

> diff --git a/reftable/writer.c b/reftable/writer.c
> index 740c98038eaf883258bef4988f78977ac7e4a75a..03acbdbcce75fd51820c5fb016bd94f0f7f4914a 100644
> --- a/reftable/writer.c
> +++ b/reftable/writer.c
> @@ -179,11 +179,20 @@ int reftable_writer_new(struct reftable_writer **out,
>  	return 0;
>  }
>  
> -void reftable_writer_set_limits(struct reftable_writer *w, uint64_t min,
> -				uint64_t max)
> +int reftable_writer_set_limits(struct reftable_writer *w, uint64_t min,
> +			       uint64_t max)
>  {
> +	/*
> +	 * The limits should be set before any records are added to the writer.
> +	 * Check if any records were added by checking if `last_key` was set.
> +	 */
> +	if (w->last_key.len)
> +		return REFTABLE_API_ERROR;

Hm. Using the last key feels somewhat dangerous to me as it does get
reset at times, e.g. when finishing writing the current section. It
_should_ work, but overall it just feels a tad to disconnected from the
thing that we actually want to check.

How about we instead use `next`? This variable records the offset of the
next block we're about to write, and `writer_flush_nonempty_block()`
uses it directly to check whether we're currently writing the first
block in order to decide whether it needs to write a header or not. If
it's 0, we know that we haven't written the first block yet. That feels
much closer aligned with what we're checking.

> diff --git a/t/unit-tests/t-reftable-stack.c b/t/unit-tests/t-reftable-stack.c
> index aeec195b2b1014445d71c5db39a9795017fd8ff2..b23edf18a7d75b0c2292490ad06d4dfaaa571e79 100644
> --- a/t/unit-tests/t-reftable-stack.c
> +++ b/t/unit-tests/t-reftable-stack.c

Can we maybe add a unit test that demonstrates the error?

Patrick




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux