Re: [PATCH v5 08/18] blame: emit a better error on 'git blame directory'

Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> · Fri, 02 Apr 2021 11:26:01 +0200

On Thu, Apr 01 2021, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason  <avarab@xxxxxxxxx> writes:
>
>> Change an early check for non-blobs in verify_working_tree_path() to
>> let any such objects pass, and instead die shortly thereafter in the
>> fake_working_tree_commit() caller's type check.
>>
>> Now e.g. doing "git blame t" in git.git emits:
>>
>>     fatal: unsupported file type t
>>
>> Instead of:
>>
>>     fatal: no such path 't' in HEAD
>
> Sorry, but I fail to see why "unsupported file type t" is quite an
> improvement.  Is this one of these irrelevant clean-up while at it
> whose benefit is unclear until much later, I have to wonder.

Because "t" is a directory that we can stat() and that exists in the
index, it makes more sense to fall through to the stat() codepath.

I think the "unsupported file type" message is a bit odd, but it's the
existing one. Perhaps changing it while we're at it to something like:

    fatal: cannot 'blame' a directory

would be better. In any case, I think saying "no such path X in HEAD"
when you can run "git show HEAD:t" to see that there is such a path
doesn't make sense.

I don't have a test for it here, but this change also makes this error
better:

    rm -rf contrib
    git blame contrib

Before we'd say:

    fatal: no such path 'contrib' in HEAD

But now we'll fall back to:

    fatal: Cannot lstat 'contrib': No such file or directory

Which could also be reworded, but aside from the specific wording I
think not aborting early when we see "this is not a blob" is better.
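
As a rough illustration of the order of checks I mean, here is a shell
model. It is only a sketch, not git's code: blame_path_check is a made-up
name, and the shell file tests stand in for the lstat() call and the type
check in fake_working_tree_commit(). The messages are the ones quoted
above.

```shell
# Hypothetical helper, not part of git. It models the order of the
# worktree checks discussed above with shell file tests instead of lstat().
# Messages go to stdout here so they are easy to capture; git itself
# reports fatal errors on stderr.
blame_path_check () {
	path=$1
	if test -L "$path" || test -f "$path"
	then
		# Symlinks and regular files are the types blame can read.
		echo "ok"
	elif test -e "$path"
	then
		# Exists, but is e.g. a directory.
		echo "fatal: unsupported file type $path"
		return 128	# die() in git exits with status 128
	else
		# Nothing on disk at all (the ENOENT case).
		echo "fatal: Cannot lstat '$path': No such file or directory"
		return 128
	fi
}
```

Under these assumptions a directory such as "t" gets the "unsupported
file type" message, and a removed "contrib" gets the lstat one, instead of
both being reported as missing from HEAD.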

>> The main point of this test is to assert that we're not doing
>> something uniquely bad when in a conflicted merge. See
>
> "this test" refers to the logic "it is OK to skip the check if one
> of the parents does have it as a blob", introduced in 9aeaab68
> (blame: allow "blame file" in the middle of a conflicted merge,
> 2012-09-11)?

Yes, will clarify.

>> -		if (!get_tree_entry(r, commit_oid, path, &blob_oid, &mode) &&
>> -		    oid_object_info(r, &blob_oid, NULL) == OBJ_BLOB)
>> +		if (!get_tree_entry(r, commit_oid, path, &blob_oid, &mode))
>>  			return;
>>  	}
>
> At least, the original logic makes sense to me in that if an early
> parent has the path as a directory we do not declare it is OK but
> keep going until we find a blob in a later parent before deciding to
> short-cut.  I am not sure what the updated "in this case we can
> bypass the real check" condition even means.  Mechanically, it says
> "if any parent has the path as any filesystem entity, even if it
> were a directory, then it is OK", but why?

Because we'll fall through to code that's better at doing the rest of
that check.
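
To spell out the mechanical difference you describe, here is a small
shell model of the two conditions. It is a sketch under assumptions, not
the real code: old_check and new_check are made-up names, and each
argument stands for what one parent has at the path ("blob", "tree" or
"missing").

```shell
# Hypothetical model of the early-return condition in
# verify_working_tree_path(); one argument per parent commit.

# Before the patch: only a blob in some parent allows the early return.
old_check () {
	for entry
	do
		if test "$entry" = blob
		then
			echo "return early"
			return 0
		fi
	done
	echo "keep checking"
}

# After the patch: any entry at the path in some parent is enough,
# and the type is left for the later check to judge.
new_check () {
	for entry
	do
		if test "$entry" != missing
		then
			echo "return early"
			return 0
		fi
	done
	echo "keep checking"
}
```

The two only disagree when no parent has a blob at the path but at least
one has some other entry there, e.g. "old_check tree" prints "keep
checking" while "new_check tree" prints "return early".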

Looking at this again, another thing this changes is the behavior of
--contents, which I again think is an improvement. Let's say you:

    rm Makefile &&
    mkdir Makefile &&
    touch Makefile/foo &&
    git add Makefile &&
    git commit -m"foo"

I.e. turn a random file into a directory, then:

    git show origin/master:Makefile | git blame --contents - Makefile

Without my patch we currently say:

    fatal: no such path 'Makefile' in HEAD

With my patch we'll do what the user asked (and I think consistently
with the documentation) and pretend as though the stream on stdin was
the contents at the "Makefile" path.

The blame we show doesn't make much sense: it's all the lines in the
file with "Not Committed Yet". But that's another matter to do with the
blame algorithm in general, e.g. if you flip Makefile back & forth as
file->dir->file it won't traverse past the dir->file move.