Re: [PATCH 3/3] cat-file: avoid verifying submodules' OIDs

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Mar 12, 2024 at 11:35:16AM -0700, Junio C Hamano wrote:

> I actually have to wonder if the new behaviour proposed by this
> patch is a solution that is in search of a problem, or trying to
> solve an unstated problem in a wrong way.
> 
>     O=$(git rev-parse --verify :sha1collisiondetection)
>     git cat-file -t "$O"
>     
> should fail because the object whose name is $O is not available.
> Why should then this succeed and give a different result?
> 
>     git cat-file -t :sha1collisiondetection
> 
> The "cat-file" command is about objects.  While object's type may
> sometimes be inferrable (by being contained in a tree), if the user
> asks us to determine the type of the object, we should actually hit
> the object store, whether the commit object in question happens to
> be on our history or somebody else's history that our gitlink points
> at.
> 
> So, I am not yet convinced that I should take this patch.  Previous
> two steps looked good, though.

I'm not sure about "-t" in particular, but for batch output, I think if
we stop at patch 2 it would be impossible to tell the difference between
a submodule entry and a corrupt repo (or bad request). E.g., if I do
this:

  (echo HEAD:Makefile; echo HEAD:sha1collisiondetection) |
  git cat-file --batch-check='%(objectname) %(objectmode)'

after only patch 2, I'd get:

  4e255c81f22386389c7460d8f5e59426673b5a5a 100644
  HEAD:sha1collisiondetection missing

We can't tell if HEAD didn't resolve, or it doesn't have that path, or
if it's a regular blob entry and the repository is corrupt. Whereas
after patch 3, we get:

  4e255c81f22386389c7460d8f5e59426673b5a5a 100644
  855827c583bc30645ba427885caa40c5b81764d2 160000

and the mode tells us that we resolved it to a submodule.

The current behavior is not too surprising for cat-file, since it's
whole purpose is to give you information about the objects themselves,
and we don't have one here. But with this %(objectmode) format, we're
really moving into a realm of "resolve this name for me and show me the
context". We don't care about the details of the object at all!

I think you could make an argument that the problem is shoe-horning new,
slightly-mismatched functionality into cat-file. But there are lots of
practical reasons to want to do so, as we discussed elsewhere. Since
gitlinks are the only place where we'd expect an object to be missing,
"simulating" them here isn't too bad. But I suspect there's a more
general solution where cat-file learns to print dummy values for any
missing object, letting the caller see what we _could_ find out. And
then the submodule case just falls out naturally. I doubt we could make
it the default for historical compatibility; we'd need a new option.

This is all speculative on my part, of course. Probably Dscho or
Victoria can explain their use case better. :)

-Peff




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux