Re: [Bug] In `git-rev-list(1)`, using the `--objects` flag doesn't work well with the `--not` flag, as non-commit objects are not excluded

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Aug 15, 2023 at 9:31 PM Junio C Hamano <gitster@xxxxxxxxx> wrote:
> > If you notice here, the objects
> > `8baef1b4abc478178b004d62031cf7fe6db6f903`,
> > `086885f71429e3599c8c903b0e9ed491f6522879` and
> > `7a67abed5f99fdd3ee203dd137b9818d88b1bafd` are included in the output,
> > these objects are reachable from
> > `91fa9611a355db77a07f963c746d57f75af380da` and shouldn't have been
> > included since we used the `--not` flag.
>
> For performance reasons, we cannot afford to enumerate all objects
> that are reachable from --not objects and exclude them from the
> output.  So it is a balancing act to decide where to draw the line.
>
> Spending more cycles and heaps for traversing the --not side deeper
> may make the --objects output smaller and more precise, but there of
> course is cost associated with it.  And --objects do not promise
> that it gives absolute minimum---the reason it exists is to make
> sure the objects listed are sufficient to fill gaps between the
> --not tips and positive ones, which is the primary reason for its
> existence.
>

I understand this, this makes sense from how rev-list is internally
used. But from a user
facing command perspective, it's not doing what it says. So either we
add to the docs
saying that there are quirks using `--not` with `--objects`  or we fix this.

So I would still consider it a bug in that sense.

> > The diff below fixes the issue:
> >
> > @@ -3790,11 +3833,12 @@ int prepare_revision_walk(struct rev_info *revs)
> >          commit_list_sort_by_date(&revs->commits);
> >      if (revs->no_walk)
> >          return 0;
> > -    if (revs->limited) {
> > +    if (revs->limited && !revs->tree_objects) {
> >          if (limit_list(revs) < 0)
> >              return -1;
> >          if (revs->topo_order)
>
> This might change the size of the output and "fix" the output in
> your particular small test case, but I am not sure what kind of bugs
> this will introduce in the more general codepath.
>
> Not calling limit_list() when the .limited bit is on is breaking one
> of the most fundamental assumptions in the revision traversal.  When
> a feature is enabled that needs to paint the graph upfront before it
> can compute its result, the code that enables the feature flips the
> .limited to ask this part of the code to make sure it calls
> limit_list() to paint the graph with UNINTERESTING bit.
>
> This area to paint uninteresting trees have changed quite
> drastically at d5d2e935 (revision: implement sparse algorithm,
> 2019-01-16).  Some of what it removed may be contributing the "over
> counting" of trees that are relevant in your example.
>

I never proposed this as the solution, this was merely me trying to understand
why it is working the way it is, hence calling it a _naive_ fix.

> The differences in commit object names become distracting, but I do
> not think your example depends on them, so a minimum reproduction
> recipe should be
>
>         $ rm -fr new ; git init new ; cd new
>         $ echo foo >foo
>         $ git add -A; git commit -m one; git rev-parse HEAD:
>         205f6b799e7d5c2524468ca006a0131aa57ecce7
>         $ echo moo >moo
>         $ git add -A; git commit -m two; git rev-parse HEAD:
>         672d0aa883d095369c56416587bc397eee4ac37e
>         $ mkdir bar; echo goo >bar/goo; echo abc >abc
>         $ git add -A; git commit -m three; git rev-parse HEAD:
>         0c16a6cc9eef3fdd3034c1ffe2fc5e6d0bba2192
>         $ git rm moo; git commit -m four; git rev-parse HEAD:
>         ff05824d2f76436c61d2c971e11a27514aba6948
>
>         $ git rev-list --objects HEAD^..HEAD
>         5a1b93c9c4c0c9e5c969f8e0b92a02184f8f9aab
>         ff05824d2f76436c61d2c971e11a27514aba6948
>
> that output lists HEAD and HEAD^{tree} in this order, which seems to
> be what you are expecting.
>
> I am consistently getting the same result with Git 2.42-rc2, 2.41,
> 2.21, and 2.20 (I do not have ones older than 2.20 around).

The provided reproduction recipe unfortunately uses a linear history and
therefore, is not the same as the example provided by me. Here is a reproducible
recipe following the same commands you used:

$ rm -fr new ; git init new ; cd new
$ echo foo >foo
$ git add -A; git commit -m one; git rev-parse HEAD
26fb965d7439c1760677377bf314d8933de0b716
$ mkdir bar; echo goo >bar/goo
$ git add -A; git commit -m two; git rev-parse HEAD
$ git checkout -B branch
$ git reset --hard @~1
HEAD is now at 26fb965 one
$ git add -A; git commit -m three; git rev-parse HEAD
91ef508167eb683486c3df6f8d07622b61ed698d

$ git rev-list --objects HEAD ^master
91ef508167eb683486c3df6f8d07622b61ed698d
ff05824d2f76436c61d2c971e11a27514aba6948
8baef1b4abc478178b004d62031cf7fe6db6f903 abc
086885f71429e3599c8c903b0e9ed491f6522879 bar
7a67abed5f99fdd3ee203dd137b9818d88b1bafd bar/goo

$ git cat-file -p HEAD^{tree}
100644 blob 8baef1b4abc478178b004d62031cf7fe6db6f903    abc
040000 tree 086885f71429e3599c8c903b0e9ed491f6522879    bar
100644 blob 257cc5642cb1a054f08cc83f2d943e56fd3ebe99    foo

$ git cat-file -p master^{tree}
040000 tree 086885f71429e3599c8c903b0e9ed491f6522879    bar
100644 blob 257cc5642cb1a054f08cc83f2d943e56fd3ebe99    foo

> So, there is something else going on in *your* build of Git, or the
> repository you prepared for testing, or both.
>

$ git version
git version 2.41.0

So the issue definitely exists, this is on an unmodified git. You can see that
the objects `086885f71429e3599c8c903b0e9ed491f6522879` and
`7a67abed5f99fdd3ee203dd137b9818d88b1bafd` are still shown (which shouldn't
have been shown).




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux