Re: [PATCH] rev-list: print missing object type with --missing=print-type

Junio C Hamano <gitster@xxxxxxxxx> · Wed, 08 Jan 2025 14:43:53 -0800

Justin Tobler <jltobler@xxxxxxxxx> writes:

>> As I suspect that we would want to leave the door open for us to
>> extend this later, I would perhaps suggest an output format
>> like:
>> 
>>     ?<object name> [<token>=<value>]...
>
> I think this is a great idea. To select which attributes get printed
> with the missing object we could add an option. Something like:
>
>   $ git rev-list --objects --missing=print \
>   --missing-attr=path --missing-attr=type

My knee-jerk reaction is that this is over-engineered; wouldn't it
be possible for us to simply dump everything we know about the
object, and let the receiving end pick and choose?
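
To illustrate what "pick and choose" could look like on the receiving
end, here is a rough sketch in Python.  It assumes the proposed
"?<object name> [<token>=<value>]..." line format, which is not
implemented anywhere yet.  The function name and the token names
"type" and "path" are made up for illustration, and it further
assumes that values never contain whitespace (how to quote a path
with spaces in it is left undecided here).

```python
def parse_missing_line(line):
    """Parse one '?<oid> [<token>=<value>]...' line into (oid, attrs).

    Hypothetical helper: the line format is only a proposal, and values
    are assumed not to contain whitespace.
    """
    line = line.rstrip("\n")
    if not line.startswith("?"):
        return None  # not a missing-object line
    fields = line[1:].split(" ")
    oid, attrs = fields[0], {}
    for field in fields[1:]:
        if not field:
            continue  # tolerate repeated spaces
        token, sep, value = field.partition("=")
        if sep:  # skip anything that is not token=value
            attrs[token] = value
    return oid, attrs
```

A consumer that only cares about the type would then look at
attrs.get("type") and ignore every other token, including ones that
get added later.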

> I like the idea of also adding a path attribute, but this raises a
> couple of questions. The way `--missing=print` currently works is that
> it prints the unique set of missing object IDs. A missing object could
> possibly be referenced by multiple trees and thus have multiple valid
> paths.

That is not an issue at all, I think.  "rev-list --objects" that
shows objects that are not missing already has the same issue, and
the solution is "show the path when the object gets shown for the
first time".  Even when the same object is encountered during the
history-and-then-tree walk later, that object is simply not listed.

The code path that collects "I thought this blob should exist
because a tree wants to see it at this path, but the repository is
corrupt and I cannot see it there" into the missing object table
with attributes should do the same.  If the table does not yet have
the object, record the attributes (like "expected type", "path at
which the object was found") when inserting the object into the
table for the first time.  If you have a missing object and the
table already has it recorded there, don't do anything extra.
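
The "first time wins" rule above can be sketched like so.  This is
Python for illustration only, not what the real code (which is C)
would look like; the table is modelled as a plain dict, and the
function and attribute names are hypothetical.

```python
def record_missing(table, oid, expected_type=None, path=None):
    """Remember a missing object, keeping the attributes from the first sighting.

    Returns True if the object was newly inserted, False if the table
    already had it (in which case nothing is changed).
    """
    if oid in table:
        return False  # already recorded; don't do anything extra
    table[oid] = {"type": expected_type, "path": path}
    return True
```

With this, an object that is reachable via two different paths ends
up in the table once, with the path at which it was first
encountered.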