On Tue, Jul 13, 2021 at 02:17:53PM -0700, Junio C Hamano wrote: > >> Note how this commit could have been done as an optimization before > >> 88473c8bae: When our peeling hits a non-tag, we won't parse that tagged > >> object only to immediately end the loop. > > > > Yep, thanks for mentioning this, as it's somewhat subtle. > > It is too subtle that I am not sure what the paragraph wants to say. > > Before 88473c8b, we had a fully parsed object in obj and entered the > while() loop iff the outermost object is a tag, then we find the > underlying object via obj->tagged. We parse that underlying object > to find if it is a tag, and break out if it is not. > > By "this commit", I assume that the above mean the change in this > fix, i.e. parse 'obj' if it has not been parsed before looking at > its tagged field. But I am not sure how that would have been an > optimization before 88473c8b that gave a parsed tag object 'obj' > upon entry to the loop. > > Puzzled. The optimization is that we are parsing tags before looking at their structs, instead of always parsing the thing that the tag points to. So in the old loop (pseudo-code for clarity): parse_object(obj); while (obj->type == OBJ_TAG) { obj = obj->tagged; parse_object(obj); } if we had a tag pointing to a commit, we'd parse the commit. But we don't need to. We just need to know that it exists and is a commit. In the new code, we parse only when we need to look at obj->tagged: while (obj->type == OBJ_TAG) { parse_object(obj); obj = obj->tagged; } So we must "somehow" know the type of "obj" in the first place, as well as the type of every obj->tagged we look at. And that leads into your question here: > In any case, let's talk about this patch in the context to which it > is designed to be applied, i.e. post 88473c8b3c8b. > > When we come here, we have done oid_object_info() plus > lookup_object_by_type() to obtain 'obj' and we know its type. > Then we enter the loop. > > while (obj->type == OBJ_TAG) { > + if (!obj->parsed) > + parse_object(the_repository, &obj->oid); > > And we parse if it hasn't been parsed. THat is why we can ... > > obj = ((struct tag *)obj)->tagged; > if (!obj) > break; > > ... look at its tagged member. > > - if (!obj->parsed) > - parse_object(the_repository, &obj->oid); > add_name_decoration(DECORATION_REF_TAG, refname, obj); > > And the updated 'obj' (i.e. direct referent of the tag object) is > fed to add_name_decoration(). And then we move to the next > iteration. > > Now, do we know what type of object 'obj' is at this point? We > did parse the outermost tag upon entry to this loop, we replaced > 'obj' variable with the referent by following the .tagged member, > but we haven't parsed that object or ran oid_object_info() on it. > > Puzzled. ...and the answer is that we don't need to parse it. The tag object mentions the type of what it points to, and we use lookup_commit(), etc, to create the object pointed to by its "tagged" field. If we can't do that (say, because the tag is missing the type field, or we previously saw the object as a different type, etc), then obj->tagged would be NULL and we'd break out of the loop. -Peff