On Tue, Jan 21, 2025 at 04:34:12AM +0100, Karthik Nayak wrote: > The function `reftable_writer_set_limits()` allows updating the > 'min_update_index' and 'max_update_index' of a reftable writer. These > values are written to both the writer's header and footer. > > Since the header is written during the first block write, any subsequent > changes to the update index would create a mismatch between the header > and footer values. The footer would contain the newer values while the > header retained the original ones. > > To fix this bug, prevent callers from updating these values after any Nit: it's not really fixing a bug, but protecting us against it. Not worth a reroll though, from my point of view. > diff --git a/reftable/reftable-writer.h b/reftable/reftable-writer.h > index 5f9afa620bb00de66c311765fb0ae8c6f56401ae..1ea014d389cc47f173279e3234a82f3fcbc807a0 100644 > --- a/reftable/reftable-writer.h > +++ b/reftable/reftable-writer.h > @@ -124,17 +124,21 @@ int reftable_writer_new(struct reftable_writer **out, > int (*flush_func)(void *), > void *writer_arg, const struct reftable_write_options *opts); > > -/* Set the range of update indices for the records we will add. When writing a > - table into a stack, the min should be at least > - reftable_stack_next_update_index(), or REFTABLE_API_ERROR is returned. > - > - For transactional updates to a stack, typically min==max, and the > - update_index can be obtained by inspeciting the stack. When converting an > - existing ref database into a single reftable, this would be a range of > - update-index timestamps. > +/* > + * Set the range of update indices for the records we will add. When writing a > + * table into a stack, the min should be at least > + * reftable_stack_next_update_index(), or REFTABLE_API_ERROR is returned. > + * > + * For transactional updates to a stack, typically min==max, and the > + * update_index can be obtained by inspeciting the stack. When converting an > + * existing ref database into a single reftable, this would be a range of > + * update-index timestamps. > + * > + * The function should be called before adding any records to the writer. If not > + * it will fail with REFTABLE_API_ERROR. > */ Thanks for updating this. I think the reftable library is one of those code areas where it makes sense to sneak in a formatting fix every now and then because its coding style is quite alien to Git's own in some places. We could also do it all in one go, but I strongly doubt that it would be worth the churn. > -void reftable_writer_set_limits(struct reftable_writer *w, uint64_t min, > - uint64_t max); > +int reftable_writer_set_limits(struct reftable_writer *w, uint64_t min, > + uint64_t max); > > /* > Add a reftable_ref_record. The record should have names that come after > diff --git a/reftable/writer.c b/reftable/writer.c > index 740c98038eaf883258bef4988f78977ac7e4a75a..03acbdbcce75fd51820c5fb016bd94f0f7f4914a 100644 > --- a/reftable/writer.c > +++ b/reftable/writer.c > @@ -179,11 +179,20 @@ int reftable_writer_new(struct reftable_writer **out, > return 0; > } > > -void reftable_writer_set_limits(struct reftable_writer *w, uint64_t min, > - uint64_t max) > +int reftable_writer_set_limits(struct reftable_writer *w, uint64_t min, > + uint64_t max) > { > + /* > + * The limits should be set before any records are added to the writer. > + * Check if any records were added by checking if `last_key` was set. > + */ > + if (w->last_key.len) > + return REFTABLE_API_ERROR; Hm. Using the last key feels somewhat dangerous to me as it does get reset at times, e.g. when finishing writing the current section. It _should_ work, but overall it just feels a tad to disconnected from the thing that we actually want to check. How about we instead use `next`? This variable records the offset of the next block we're about to write, and `writer_flush_nonempty_block()` uses it directly to check whether we're currently writing the first block in order to decide whether it needs to write a header or not. If it's 0, we know that we haven't written the first block yet. That feels much closer aligned with what we're checking. > diff --git a/t/unit-tests/t-reftable-stack.c b/t/unit-tests/t-reftable-stack.c > index aeec195b2b1014445d71c5db39a9795017fd8ff2..b23edf18a7d75b0c2292490ad06d4dfaaa571e79 100644 > --- a/t/unit-tests/t-reftable-stack.c > +++ b/t/unit-tests/t-reftable-stack.c Can we maybe add a unit test that demonstrates the error? Patrick