Re: Distinguishing FF vs non-FF updates in the reflog?

On Thu, Mar 18, 2021 at 09:58:56AM +0100, Han-Wen Nienhuys wrote:

> > 1) Not all updates make it to the reflogs
> > 2) Reflogs can be edited or mucked with
> > 3) On NFS reflogs can outright be wrong even when used properly as there are
> > caching issues. We specifically have seen entries that appear to be FFs that
> > were not.
> 
> Can you tell a little more about 3) ? Since we don't annotate non-FF
> vs FF today, what does "appear to be FFs" mean?
> 
> But you are right: since the reflog for a branch is in a different
> file from the branch head, there is no way to do an update to both of
> them at the same time. I guess this will have to be a reftable-only
> feature.

Each individual reflog entry (in the branch reflog and the HEAD reflog)
should still be consistent, though. They give the "before" and "after"
object ids, and the ff-ness is an immutable property of those commit
ids.
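
To make that concrete, here is a minimal sketch of recomputing the ff-ness
of every entry from the two ids it records. It assumes the files ref
backend, where each reflog line starts with the old and new object ids, and
classify_reflog is a hypothetical helper name, not a git command:

```shell
# Hypothetical helper, not a git command. Assumes the "files" ref backend,
# where every reflog line begins with "<old-oid> <new-oid> ".
classify_reflog() {
    ref=${1:-HEAD}
    log=$(git rev-parse --git-path "logs/$ref") || return 1
    [ -f "$log" ] || { echo "no reflog file for $ref" >&2; return 1; }

    while read -r old new rest; do
        # An all-zero id marks ref creation (old) or deletion (new).
        case $old in
            *[!0]*) ;;
            *) echo "create $old $new"; continue ;;
        esac
        case $new in
            *[!0]*) ;;
            *) echo "delete $old $new"; continue ;;
        esac
        # Exit 0: old is an ancestor of new, so the update was a fast-forward.
        # Exit 1: it is not. Anything else: missing objects or another error.
        rc=0
        git merge-base --is-ancestor "$old" "$new" 2>/dev/null || rc=$?
        case $rc in
            0) kind=ff ;;
            1) kind=non-ff ;;
            *) kind=unknown ;;
        esac
        echo "$kind $old $new"
    done <"$log"
}
```

Running "classify_reflog refs/heads/main" prints one line per entry, oldest
first. Entries whose objects have since been pruned come out as "unknown"
rather than being guessed at.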

> > I believe that today git can do very fast reachability checks without opening
> > pack files by using some of its indexes (bitmap code or
> > https://git-scm.com/docs/commit-graph ?). It probably makes sense to add this
> > ability to jgit if that is what you need?
> 
> The bitmaps are generated by GC, and you can't GC all the time. JGit
> has support for bitmaps, and its support actually predates C-Git's
> support for it. (It was added to JGit by Colby Ranger who worked in
> Shawn's team).

Bitmaps can help with these checks, but we don't actually look at them
in most of the algorithms one might use for computing ancestry. One of
the reasons for that is that they often backfire as an optimization,
because:

  - as you note, they are often not up to date because they require a
    repack. So they won't help when asking about very recently added
    commits (which people tend to ask about more than ancient ones).

  - the bitmap file format doesn't have any index. So a reader has to
    scan the whole thing upon opening to decide which commits have
    bitmaps.

For several years we had a patch at GitHub that checked for bitmaps
during "--contains" traversals. Even though it did sometimes backfire,
it was enough of a net win to be worth keeping, compared to actually
opening commit objects to follow their parent pointers. But with
commit-graphs, it was a strict loss, and we stopped using it entirely
last year. (We do still look at bitmaps for our branch ahead/behind
checks using a custom patch; I'm suspicious of its performance for the
same reasons, but we haven't dug carefully into it).

But...

> I expect that the commit graph doesn't work for my intended use-case.

...I think commit-graphs are a big win here. They are more often kept up
to date, because they can be generated incrementally with effort
proportional to the number of new commits. And they make a big
difference if the traversal has to cover a lot of commits. E.g., here's
the most extreme case in git.git, checking ancestry of the oldest
commit:

  $ time git merge-base --is-ancestor e83c5163316f89bfbde7d9ab23ca2e25604af290 HEAD; echo $?

  real	0m0.014s
  user	0m0.008s
  sys	0m0.005s
  0

  $ time git -c core.commitgraph=false merge-base --is-ancestor e83c5163316f89bfbde7d9ab23ca2e25604af290 HEAD; echo $?

  real	0m0.398s
  user	0m0.369s
  sys	0m0.028s
  0

Of course most results won't be so dramatic, because they wouldn't have
to traverse many commits in the first place (so they are already pretty
fast with or without the commit-graph).  But that 14ms should be an
upper bound for this repo. And naturally that scales with the number of
commits; in linux.git it's 43ms, compared to 8.7s without commit-graphs.
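
For reference, the incremental generation mentioned above can be sketched
like this. It assumes a git recent enough to support "--split", and
refresh_commit_graph is a hypothetical wrapper name used only for
illustration:

```shell
# Hypothetical wrapper, not a git command.
refresh_commit_graph() {
    # --reachable: walk from all refs to find commits to include.
    # --split: write the new commits as an additional layer in a chain of
    #          graph files instead of rewriting one monolithic file.
    git commit-graph write --reachable --split
}

# Alternatively, setting fetch.writeCommitGraph=true asks git to update
# the graph after each fetch:
#   git config fetch.writeCommitGraph true
```

After running it, "git commit-graph verify" can be used to check the
resulting files against the object database.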

-Peff


