Re: bug: git name-rev --stdin --no-undefined on detached head

John Cai <jcai@xxxxxxxxxx> · Fri, 24 Dec 2021 12:09:46 -0800

> On Dec 24, 2021, at 11:42 AM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> 
> John Cai <jcai@xxxxxxxxxx> writes:
> 
>> It seems like this bug can be generalized to “git name-rev
>> --stdin” does not work with --no-undefined nor --name-only
>> 
>> The --name-only case seems clear to me that we should fix
>> it. It’s misleading to return the sha instead of “undefined”
>> for a rev without a symbolic name, as a sha could be a symbolic
>> name.
>> 
>> I think we can also make the argument that --no-undefined should
>> also die in --stdin mode when given a rev without any symbolic
>> names.
> 
> Hmph, the manual page documents:
> 
>    --stdin::
>            Transform stdin by substituting all the 40-character SHA-1
>            hexes (say $hex) with "$hex ($rev_name)".  When used with
>            --name-only, substitute with "$rev_name", omitting $hex
>            altogether.  Intended for the scripter's use.
> 
> It is unfortunate that the way this option works is confusingly a
> bit different from what we learned to expect from the --stdin option
> other subcommands like "git pack-objects --stdin" takes.  In short,
> these are not equivalent:
> 
> 	git name-rev [<options>] $string
>        printf "%s" "$string" | xargs git name-rev [<options>]
>        printf "%s" "$string" | git name-rev --stdin [<options>]
> 
> The first two are supposed to be the equivalent, but the third one
> is different by design.  Its `--stdin` mode is expected to read
> something like this [*]:
> 
> 	$ cat sample.txt
>        A revision that exists 2ae0a9cb82 is shown here,
>        and its full name is 2ae0a9cb8298185a94e5998086f380a355dd8907
>        while its tree object is 70d105cc79e63b81cfdcb08a15297c23e60b07ad
>        which probably is undescribable hexdigits.
> 
> and its designed use is to annotate its input into a more reader
> friendly from with refnames where possible.  

No wonder! This elucidates why I found the user experience of the --stdin 
mode a bit unexpected, as it would simply echo back the input when it couldn’t find a 
valid object or refname rather than return an error message. I think I totally missed 
the verbiage in the documentation that states its meant to **substitute** text in stdin.
Makes sense to echo back the input if nothing useful could be found.

> Here is what we get:
> 
>        $ git name-rev --stdin <sample.txt
>        A revision that exists 2ae0a9cb82 is shown here,
>        and its full name is 2ae0a9cb8298185a94e5998086f380a355dd8907 (master)
>        while its tree object is 70d105cc79e63b81cfdcb08a15297c23e60b07ad
>        which probably is undescribable hexdigits.
> 
> I notice a few things.
> 
> * An abbreviated commit object name is not affected;
> 
> * A 40-digit string that cannot be described with a reference is
>   left alone, without "undefined".
> 
> It might be debatable that the latter may want to be annotated with
> "undefined", but as the command does not molest other noise strings
> like "its" "full" name" in the input, I think the current behaviour
> is preferred over appending "(undefined)" after a string we do not
> recognize that happens to be 40-hex.
> 
> When used with --name-only, we see this:
> 
> 	$ git name-rev --name-only --stdin <sample.txt
>        A revision that exists 2ae0a9cb82 is shown here,
>        and its full name is master
>        while its tree object is 70d105cc79e63b81cfdcb08a15297c23e60b07ad
>        which probably is undescribable hexdigits.
> 
> So, as far as I can see, it is working as described.  If there is
> any bug in the things I saw and shown here, it is that it is
> misleading to claim that this behaviour is intended for scripter's
> use.  It clearly is not scripter friendly when you want to run
> "name-rev" on unbounded number of object names you have, which may
> not fit on the command line, as that is not how it was designed to
> be used.
> 
> Two possible things we can do to improve are
> 
> * Fix the documentation; it is not for scripters but for annotating
>   text with object names.

Makes sense to update the documentation to make it clear what --stdin mode is
meant to do. 

> 
> * Possibly add --names-from-standard-input option that would behave
>   more like "we cannot afford to stuff all object names on the
>   command line, so we feed them one by one from the standard input"
>   mode the "--stdin" option of other subcommands use.
> 
> I do not think the latter is so important, as it is perfectly OK to
> use xargs to split the large input into multiple invocations of
> name-rev.  This is unlike "pack-objects --stdin" where the command
> needs to see _all_ input in a single invocation.
> 
> 
> [Footnote]
> 
> * The sample input was produced with
> 
>        $ cat >sample.txt <<EOF
>        A revision that exists $(git rev-parse --short HEAD) is shown here,
>        and its full name is $(git rev-parse HEAD)
>        while its tree object is $(git rev-parse HEAD:)
>        which probably is undescribable hexdigits.
>        EOF
> 
> if you want to try it at home ;-)