Re: [PATCH v3 2/2] reftable: adapt writer code to propagate block_writer_add() errors

Patrick Steinhardt <ps@xxxxxx> · Wed, 19 Mar 2025 14:18:15 +0100

On Thu, Mar 13, 2025 at 08:59:51PM +0530, Meet Soni wrote:
> On Wed, 12 Mar 2025 at 18:19, Patrick Steinhardt <ps@xxxxxx> wrote:
> >
> > > +     /*
> > > +      * The current block is full, so we need to flush and reinitialize the
> > > +      * writer to start writing the next block.
> > > +      */
> > >       arg->err = writer_flush_block(arg->w);
> > >       if (arg->err < 0)
> > >               goto done;
> >
> > But there is another case further down where we do `block_writer_add()`
> > and then re-try in case the write fails. This one is a bit more curious:
> > if the write fails, we don't create a new block -- after all we have
> > just created one. Instead, we reset the record's offset length to zero
> > before retrying.
> >
> > I _think_ that this is done because we know that when resetting the
> > offset we would write less data to the block, as can be seen in
> > `reftable_obj_record_encode()`. But I'm honestly not quite sure here as
> > I haven't yet done a deep dive into object records -- after all, we
> > don't even really use them in Git.
> >
> > In any case, I think that this callsite also needs adjustment and
> > warrants a comment. And if so, all changes to `write_object_record()`
> > should probably go into a separate commit, as well.
> >

Sorry for taking so long to respond.

> Regarding the callsite in write_object_record() where we reset the
> record's offset length to zero before retrying: my changes currently
> follow the same principle.
> 
>     - If block_writer_add() returns an error other than
>       REFTABLE_ENTRY_TOO_BIG_ERROR, we simply return.
> 
>     - For REFTABLE_ENTRY_TOO_BIG_ERROR, we flush the block and retry.
> 
>     - If that fails, we reset the record's offset length to zero and
>       then retry.

It's this last step that also needs to be adapted to check for
REFTABLE_ENTRY_TOO_BIG_ERROR, because the intent here is the exact same:
if writing the object record failed even in a completely new block then
we reset the object's offset and try a third time. But same as for the
first time, we don't check whether we get REFTABLE_ENTRY_TOO_BIG_ERROR
here and thus we might be failing and retrying even in unintended cases.

> I'm not sure what adjustments or additional comments you are referring to.
> Could you please clarify what changes you expect at this callsite?

So overall we'd add the check to both callsites, like in the below
patch.

Patrick

diff --git a/reftable/writer.c b/reftable/writer.c
index f3ab1035d61..63629e00a37 100644
--- a/reftable/writer.c
+++ b/reftable/writer.c
@@ -628,6 +628,8 @@ static void write_object_record(void *void_arg, void *key)
 	arg->err = block_writer_add(arg->w->block_writer, &rec);
 	if (arg->err == 0)
 		goto done;
+	if (arg->err != REFTABLE_ENTRY_TOO_BIG_ERROR)
+		goto done;
 
 	arg->err = writer_flush_block(arg->w);
 	if (arg->err < 0)
@@ -640,6 +642,8 @@ static void write_object_record(void *void_arg, void *key)
 	arg->err = block_writer_add(arg->w->block_writer, &rec);
 	if (arg->err == 0)
 		goto done;
+	if (arg->err != REFTABLE_ENTRY_TOO_BIG_ERROR)
+		goto done;
 
 	rec.u.obj.offset_len = 0;
 	arg->err = block_writer_add(arg->w->block_writer, &rec);