Re: Distinguishing FF vs non-FF updates in the reflog?

On Mon, Mar 22, 2021 at 03:40:46PM +0100, Han-Wen Nienhuys wrote:

> > I left some numbers in another part of the thread, but IMHO performance
> > isn't that compelling a reason to do this these days, if you are using
> > commit-graphs.
> >
> > Just walking the reflog might be _slightly_ faster, though not
> > necessarily (it depends on whether the depth of the object graph or the
> > depth of the reflog chain is deeper). It might matter more if you are
> > using a more exotic storage scheme, where switching from accessing
> > reflogs to objects implies extra round-trips to a server (e.g., custom
> > storage backends with JGit; I don't know the state of the art in what
> > Google is doing there).
> 
> JGit doesn't currently support commit-graph, so it's hard to predict
> what performance will be like, but isn't commit-graph keyed by
> SHA1? That makes it hard to do caching, especially when considering
> large repositories.

Yes, it's keyed by sha1. It's essentially replacing "inflate the commit
object and parse it" with "here are the parsed values as mmap-able
32-bit integer fields" (there's some other stuff with generation
numbers, too, but the main speedup is simply that accessing each commit
is orders of magnitude cheaper).
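
To make the "parsed values as fixed-width integer fields" idea concrete, here is a rough sketch. The record layout (two 32-bit parent positions plus a 64-bit timestamp) and the NO_PARENT sentinel are made up for illustration; this is not the actual commit-graph on-disk format. The point is only that reading a commit becomes an offset computation and an integer decode, with no inflate and no text parsing.

```python
import struct

# Hypothetical fixed-width record: two parent positions and a commit time.
# This layout is an assumption for illustration, not git's real format.
RECORD = struct.Struct(">IIQ")
NO_PARENT = 0xFFFFFFFF  # hypothetical sentinel meaning "no parent in this slot"


def pack_commits(commits):
    """Serialize (parent1, parent2, timestamp) tuples into one flat buffer."""
    return b"".join(RECORD.pack(p1, p2, ts) for p1, p2, ts in commits)


def read_commit(buf, pos):
    """Decode the record at position `pos` straight out of the buffer."""
    p1, p2, ts = RECORD.unpack_from(buf, pos * RECORD.size)
    parents = [p for p in (p1, p2) if p != NO_PARENT]
    return parents, ts


buf = pack_commits([
    (NO_PARENT, NO_PARENT, 1616000000),  # position 0: root commit
    (0, NO_PARENT, 1616000100),          # position 1: child of 0
    (1, 0, 1616000200),                  # position 2: merge of 1 and 0
])
```

The buffer could just as well be an mmap of a file, since unpack_from() accepts any bytes-like object.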

It caches well, because those properties of the commit are immutable.
But if you meant "when pulling data from the commit-graph file, is it
friendly to a block cache", then no, access is not linear. You'd binary
search within it to find each commit, just as you would in a pack .idx.
And just as with a .idx, I'd expect a system that is pulling data from a
network source to want to grab the whole commit-graph file; they tend to
be much smaller than the main .idx for a given repo.
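
As an illustration of that access pattern, a minimal sketch of a binary search over a flat, sorted table of fixed-width ids. The table here is built in memory from made-up ids, and the function name is hypothetical; the real files carry additional structure that this ignores. Each probe jumps to a different part of the table, which is why a block cache sees little locality.

```python
import hashlib

OID_LEN = 20  # width of a SHA-1 in bytes


def find_position(table, oid):
    """Binary-search a flat, sorted table of fixed-width ids.

    Returns the position of `oid`, or None if it is not present.
    """
    lo, hi = 0, len(table) // OID_LEN
    while lo < hi:
        mid = (lo + hi) // 2
        candidate = table[mid * OID_LEN:(mid + 1) * OID_LEN]
        if candidate < oid:
            lo = mid + 1
        elif candidate > oid:
            hi = mid
        else:
            return mid
    return None


# Made-up ids, sorted and concatenated into one buffer.
oids = sorted(hashlib.sha1(str(n).encode()).digest() for n in range(1000))
table = b"".join(oids)
```

The position that comes back is what you would then use to index into the per-commit data.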

> AFAIU, commit-graph would help speed up reachability checks, by being
> able to shortcut cases where the commit number proves that some commit
> is not ancestor of the other, but you still have to do a revwalk to
> conclusively prove reachability.

Right. You'll still walk a lot of the commits, but you'll do so much
faster (the generation numbers can also help prune some uninteresting
side paths, but again, I think the main value for this operation is just
getting the parent info much faster).
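
For the FF vs non-FF question, the check boils down to "is the old value an ancestor of the new one". Here is a small sketch of that walk with generation-number pruning, using a toy in-memory graph. The helper names and the dict-based graph are made up for illustration, and I am assuming the simple definition where a root has generation 1 and every other commit has one more than the maximum of its parents.

```python
def compute_generations(parents):
    """Generation number: 1 for roots, else 1 + max over the parents."""
    gen = {}
    for start in parents:
        stack = [start]
        while stack:
            c = stack[-1]
            if c in gen:
                stack.pop()
                continue
            missing = [p for p in parents[c] if p not in gen]
            if missing:
                stack.extend(missing)  # resolve the parents first
            else:
                gen[c] = 1 + max((gen[p] for p in parents[c]), default=0)
                stack.pop()
    return gen


def is_ancestor(parents, gen, old, new):
    """Walk back from `new`; True if `old` is reachable (a fast-forward)."""
    stack, seen = [new], {new}
    while stack:
        c = stack.pop()
        if c == old:
            return True
        # An ancestor always has a strictly smaller generation number, so a
        # commit at or below old's generation cannot lead to `old`.
        if gen[c] <= gen[old]:
            continue
        for p in parents[c]:
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return False


# Toy history: A <- B <- C, A <- D, and M merges C and D.
parents = {"A": [], "B": ["A"], "C": ["B"], "D": ["A"], "M": ["C", "D"]}
gen = compute_generations(parents)
```

With this graph an update from B to M is a fast-forward, while an update from C to D is not, and the second case is rejected without walking past D at all.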

-Peff


