Re: Possible bug in git describe, additional commits differs when cloned with --depth

On Fri, Sep 27, 2019 at 11:51:07AM +0200, Anders Janmyr wrote:
> Hi,
> 
> I'm not sure if this is a bug or not but `git describe` gives
> different results when the repo has been cloned with `--depth` or not.
> 
> In the example below from the git repository the number of additional
> commits since the
> last tag differs 256 vs. 265.
> 
> ```
> $ git clone https://github.com/git/git
> $ cd git/
> $ git describe
> v2.23.0-256-g4c86140027
> $ git rev-list -n 1 HEAD
> 4c86140027f4a0d2caaa3ab4bd8bfc5ce3c11c8a
> 
> 
> $ git clone --depth=50 https://github.com/git/git git-depth
> $ cd git-depth/
> $ git describe
> v2.23.0-265-g4c861400
> $ git rev-list -n 1 HEAD
> 4c86140027f4a0d2caaa3ab4bd8bfc5ce3c11c8a
> ```

I don't think this is a bug, but rather an inherent limitation of
shallow histories with lots of merges, and it affects not only 'git
describe', but any limited history traversal.

In the Git project new features are developed on their dedicated
branches, which are then eventually merged to 'master'.  Alas, we make
mistakes, and sometimes we realize that a feature was buggy after it
has already been merged to 'master'.  In such cases the bugfix is
often applied not on top of 'master', but on top of the feature
branch, so it can be merged to maintenance releases as well.

This results in a history like this:

  M2     Merge the bugfix to 'master'
  |  \
  |   \
 v2.0  |
  |    o  Bugfix for new feature
 CO2   |
  |    |
  M1  /  Merge 'new feature' to 'master'
  | \/
  |  o   new feature
  |  |
  |  o
  |  |
  | CO1
  |  |    
  | /
 v1.0

Describing M2 in a full repository results in something like
v2.0-2-gdeadbeef, because M2 contains only two commits that aren't in
v2.0 (M2 itself and the bugfix).

Now let's suppose that in a shallow repo the given '--depth=<N>'
resulted in a cutoff at commits CO1 and CO2, meaning that the shallow
repo does not include commits M1 and v1.0.  Consequently, Git can't
possibly see that the commits implementing the new feature are already
merged and thus reachable from v2.0, so it will count those commits as
well, resulting in v2.0-5-gabcdef.
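To make this concrete, here is a rough sketch that rebuilds the toy
history above in a scratch directory and compares the two clones.  The
repository names, the commit messages, the two "filler" commits and
'--depth=5' are all assumptions picked so that the cutoff lands at CO1
and CO2; none of it comes from the Git repository itself.

```shell
#!/bin/sh
# Sketch only: rebuild the toy history and compare 'git describe' in a
# full and in a shallow clone.  All names and the depth are made up.
set -e

work=$(mktemp -d)
cd "$work"

git init -q full
cd full
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

# Give every commit a distinct, increasing timestamp.
t=1500000000
tick () {
	t=$((t + 60))
	GIT_AUTHOR_DATE="$t +0000"
	GIT_COMMITTER_DATE="$t +0000"
	export GIT_AUTHOR_DATE GIT_COMMITTER_DATE
}
commit () {
	tick
	git commit -q --allow-empty -m "$1"
}
merge () {
	tick
	git merge -q --no-ff -m "$1" feature
}

commit "v1.0"
git tag -a -m v1.0 v1.0

git checkout -q -b feature
commit "CO1"
commit "feature, part 1"
commit "new feature"

git checkout -q master
merge "M1"
commit "CO2"
commit "filler 1"	# extra commits so that --depth=5 cuts at CO2
commit "filler 2"
commit "v2.0"
git tag -a -m v2.0 v2.0

git checkout -q feature
commit "bugfix"
git checkout -q master
merge "M2"

full_desc=$(git describe)
full_count=$(git rev-list --count v2.0..HEAD)

# '--depth' is ignored for plain local paths, hence the file:// URL.
cd "$work"
git clone -q --depth=5 "file://$work/full" shallow
cd shallow
shallow_desc=$(git describe)
shallow_count=$(git rev-list --count v2.0..HEAD)

echo "full:    $full_desc ($full_count commits since v2.0)"
echo "shallow: $shallow_desc ($shallow_count commits since v2.0)"

# Fetching the missing history should bring the count back down.
git fetch -q --unshallow
deepened_desc=$(git describe)
echo "after --unshallow: $deepened_desc"
```

With this layout the full repo should describe M2 as v2.0-2-g<hash>
and the shallow one as v2.0-5-g<hash>; after 'git fetch --unshallow'
the shallow clone should agree with the full one again.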

There is a lot more going on in the Git repository, so it's not as
simple as above.  Case in point is the merge d1a251a1fa (Merge branch
'en/checkout-mismerge-fix', 2019-09-09), which merges a fix to a bug
that happened before v2.22.0-rc0, but that bug was not introduced in
the feature branch, but while merging that branch to 'master'.  The
result is still the same, though: since there are a lot of commits on
the ancestry path between that buggy merge and v2.23.0, '--depth=50'
doesn't include them all in the shallow clone, so Git can't possibly
know that that merge is reachable from v2.23.0.

  # same in both the full and shallow repos
  $ git log --oneline v2.23.0..d1a251a1fa^ |wc -l
  57

  # in the full repo
  $ git log --oneline v2.23.0..d1a251a1fa |wc -l
  59

  # in the shallow repo
  $ git log --oneline v2.23.0..d1a251a1fa |wc -l
  132

> In the example above the first version also gives additional digits
> for the SHA,
> g4c86140027 vs. g4c861400, but that is not always the case.

Git shows as many hex digits as needed to form a unique object name,
plus a few additional digits' worth of safety margin.  There are a lot
more objects in the full repository than in the shallow clone, which
means more hex digits in the abbreviated object name.
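If the varying length is a nuisance, e.g. when comparing the output
across clones, a fixed minimum length can be requested explicitly.  A
small sketch on a throwaway repository (the repository, tag name and
the length 12 are arbitrary assumptions):

```shell
#!/bin/sh
# Sketch only: ask for a fixed abbreviation length instead of relying
# on the default, which depends on the number of objects in the repo.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Demo"
git config user.email "demo@example.com"

git commit -q --allow-empty -m "first"
git tag -a -m v1.0 v1.0
git commit -q --allow-empty -m "second"

default_desc=$(git describe)            # length chosen by Git
fixed_desc=$(git describe --abbrev=12)  # at least 12 hex digits
fixed_hash=${fixed_desc##*-g}           # strip "v1.0-1-g"
short_hash=$(git rev-parse --short=12 HEAD)

echo "$default_desc"
echo "$fixed_desc"
echo "$short_hash"
```

Note that '--abbrev=<n>' is only a minimum: Git still adds digits if
<n> is not enough to name the object uniquely.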


Thanks for letting us know.  I think this is worth a warning in
the documentation of 'git clone --depth'.



