Re: Feature Request: Option to make "git rev-list --objects" output duplicate objects

On Fri, Mar 24, 2023 at 03:51:21PM +0000, Baumann, Moritz wrote:

> …and then used the resulting list for all subsequent checks. After writing some
> unit tests, I noticed that the returned list is not sufficient: If you generate
> the exact same file twice, once with a "bad" name and once with a "good" name,
> you will only see one of those names and therefore the hook will mistakenly
> allow the push.
> 
> So, what I would want/need is an option that forces "git rev-list --objects"
> to output the object multiple times if it has multiple names in the commit
> range. Admittedly, such an option would likely only be useful for hooks that
> validate file names.

Another problem you might not have run into yet: the names given by
rev-list are not quoted in any way, and a name is simply cut off at its
first newline. So if your hook is trying to avoid malicious garbage like
"foo\nbar", it won't work.

Those names are really just intended as hints for pack-objects. I
suspect the documentation could be more clear about these limitations.
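
To make the one-name-per-object behavior from the report concrete, here is a
minimal sketch that reproduces it in a throwaway repository. The file names
and the demo identity are made up for illustration; nothing here comes from
the original hook.

```shell
#!/bin/sh
# Sketch only: build a scratch repository in which one blob has two names.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Identical content under a hypothetical "good" and "bad" name.
printf 'same content\n' >good-name.txt
printf 'same content\n' >bad-name.txt
git add good-name.txt bad-name.txt
git -c user.name=demo -c user.email=demo@example.com \
	commit -q -m 'two names, one blob'

# Both paths resolve to the same blob id.
blob=$(git rev-parse HEAD:good-name.txt)

# Count how many lines of "rev-list --objects" mention that blob.
# Each object is listed once, so only one of the two names can appear.
git rev-list --objects HEAD | grep -c "^$blob"
```

A hook that only looks at the name column of that output can therefore never
see the second name.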

> Would it be feasible to implement such an option? If so, does it sound like a
> good or bad idea?
> 
> Is there any alternative for my use case that doesn't involve walking the
> commits one-by-one? (That's what we previously did and what turned out to be
> quite slow on our repository.)

I'm not sure what you mean by "one by one", since that is inherently
what rev-list is doing under the hood. If you mean "running a separate
process for each commit", then yes, that will be slow. But if you want
to know all of the names touched in a set of commits, I have used
something like this before:

  git rev-list $new --not --all |
  git diff-tree --stdin --format= -r -c --name-only

A few notes:

  - the names may be quoted if they have metacharacters; use "-z" if
    your reading side can handle it to make things simpler

  - merges are always tricky. I think "-c" will give you what you want
    (showing names which differed from any parent), but I didn't think
    too hard.  Using "-m" definitely would work, but may produce extra
    names (ones where the merge just brought together two lines of
    history, even though the commit where one of those lines touched the
    file may have been excluded via "--not --all").

  - if you are assuming the existing names are good, then probably
    --diff-filter=A would be useful, as it would show only
    newly-introduced names.
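
As a rough illustration of the pipeline with the --diff-filter=A note applied,
the sketch below simulates a pushed commit in a scratch repository and lists
the names it adds. The file names, the demo identity, and the "rewind the
branch" trick for imitating a not-yet-accepted push are assumptions for the
demo, not part of any real hook.

```shell
#!/bin/sh
# Sketch only: imitate a hook seeing a commit that no ref points at yet.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
commit() {
	git -c user.name=demo -c user.email=demo@example.com commit -q "$@"
}

# Existing history, already reachable from a ref.
printf 'base\n' >README
git add README
commit -m 'base'

# The "pushed" commit: the same content added under two hypothetical names.
printf 'same content\n' >good-name.txt
cp good-name.txt bad-name.txt
git add good-name.txt bad-name.txt
commit -m 'two names, one blob'
new=$(git rev-parse HEAD)

# Rewind the branch so $new is no longer reachable from any ref or HEAD.
git reset -q --hard HEAD~1

# Names added by commits reachable from $new but not from existing refs.
names=$(git rev-list "$new" --not --all |
	git diff-tree --stdin --format= -r -c --name-only --diff-filter=A)
printf '%s\n' "$names"
```

Because diff-tree reports paths rather than objects, both names show up even
though they share one blob. For names with unusual characters, adding "-z" and
a NUL-aware reader would be the safer variant, as noted above.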

-Peff


