Re: [PATCH v2 1/3] refs: keep track of unresolved reference value in iterators

Patrick Steinhardt <ps@xxxxxx> · Mon, 5 Aug 2024 12:59:54 +0200

On Thu, Aug 01, 2024 at 09:41:03AM -0700, Junio C Hamano wrote:
> "John Cai via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:
> > @@ -886,6 +889,9 @@ static int files_ref_iterator_advance(struct ref_iterator *ref_iterator)
> >  		iter->base.refname = iter->iter0->refname;
> >  		iter->base.oid = iter->iter0->oid;
> >  		iter->base.flags = iter->iter0->flags;
> > +		if (iter->iter0->flags & REF_ISSYMREF)
> > +			iter->base.referent = iter->iter0->referent;
> 
> Presumably base.referent is initialized to NULL so this "if"
> statement does not need an else clause?

This function typically ends up being called in a loop though. So
without the else clause, wouldn't we potentially leak the value of a
preceding ref into subsequent iterations like this?

> I am primarily wondering why this needs to be conditional.  We are
> making verbatim copy of the flags word from *iter->iter0 to
> iter->base; if iter0 is symref we want to mark base also as symref,
> If iter0 is not a symref, then we want to mark base also not a
> symref, presumably.  So wouldn't it make sense to just say
> 
> 		iter->base.referent = iter->iter0->referent;
> 
> here, regardless of what iter->iter0->flags say about symref-ness of
> the thing?  Because ...

> > diff --git a/refs/iterator.c b/refs/iterator.c
> > index d355ebf0d59..26acb6bd640 100644
> > --- a/refs/iterator.c
> > +++ b/refs/iterator.c
> > @@ -7,6 +7,7 @@
> >  #include "refs.h"
> >  #include "refs/refs-internal.h"
> >  #include "iterator.h"
> > +#include "strbuf.h"
> >  
> >  int ref_iterator_advance(struct ref_iterator *ref_iterator)
> >  {
> > @@ -29,6 +30,7 @@ void base_ref_iterator_init(struct ref_iterator *iter,
> >  {
> >  	iter->vtable = vtable;
> >  	iter->refname = NULL;
> > +	iter->referent = NULL;
> >  	iter->oid = NULL;
> >  	iter->flags = 0;
> >  }
> > @@ -199,6 +201,7 @@ static int merge_ref_iterator_advance(struct ref_iterator *ref_iterator)
> >  		}
> >  
> >  		if (selection & ITER_YIELD_CURRENT) {
> > +			iter->base.referent = (*iter->current)->referent;
> >  			iter->base.refname = (*iter->current)->refname;
> >  			iter->base.oid = (*iter->current)->oid;
> >  			iter->base.flags = (*iter->current)->flags;
> 
> ... other parts of the API seem to follow that philosophy.
> 
> In other words, the invariant of this code is that .referent is
> non-NULL if and only if the ref is a symref, and that invariant is
> trusted without checking with .flags member.  But the earlier hunk
> that copied iter0 to base did not seem to be using that invariant,
> which made it look inconsistent.

Agreed.

> >  struct ref_entry *create_ref_entry(const char *refname,
> > +				   const char *referent,
> >  				   const struct object_id *oid, int flag)
> >  {
> >  	struct ref_entry *ref;
> > @@ -41,6 +43,10 @@ struct ref_entry *create_ref_entry(const char *refname,
> >  	FLEX_ALLOC_STR(ref, name, refname);
> >  	oidcpy(&ref->u.value.oid, oid);
> >  	ref->flag = flag;
> > +
> > +	if (flag & REF_ISSYMREF)
> > +		ref->u.value.referent = xstrdup_or_null(referent);
> 
> OK.  ref_value now has referent next to the existing oid, but that
> member only makes sense when flag says it is a symref.
> 
> Curiously, that information is missing from ref_value struct, so by
> looking at a ref_value alone, we cannot tell if we should trust the
> value in the .referent member?
> 
> Makes me wonder if we should follow the same "ignore what the flag
> says when filling the .referent member; if the ref is not a symref,
> the referent variable is NULL, and if it is, referent is never NULL"
> pattern?  Then ref->u.value.referent is _always_ defined---the
> current code says "the u.value.referent member is undefined for ref
> that is not a symref", but with the suggested change, it will be
> "the u.value.referent member is NULL for ref that is not a symref,
> and for a symref, it is the value of the symref".

Yeah, I think that would be preferable indeed.

Patrick
Attachment:
signature.asc

Description: PGP signature