Re: [PATCH v8 04/14] graph: add commit graph design document

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Derrick Stolee <stolee@xxxxxxxxx> writes:

> +Future Work
> +-----------
> +
> +- The commit graph feature currently does not honor commit grafts. This can
> +  be remedied by duplicating or refactoring the current graft logic.

The problem in my opinion lies in different direction, namely that
commit grafts can change, changing the view of the history.  If we want
commit-graph file to follow user-visible view of the history of the
project, it needs to respect current version of commit grafts - but what
if commit grafts changed since commit-graph file was generated?

Actually, there are currently three ways to affect the view of the
history:

a. legacy commit grafts mechanism; it was first, but it is not safe,
   cannot be transferred on fetch / push, and is now deprecated.

b. shallow clones, which are kind of specialized and limited grafts;
   they used to limit available functionality, but restrictions are
   being lifted (or perhaps even got lifted)

c. git-replace mechanism, where we can create an "overlay" of any
   object, and is intended to be among others replacement for commit
   grafts; safe, transferable, can be turned off with "git
   --no-replace-objects <command>"

All those can change; some more likely than others.  The problem is if
they change between writing commit-graph file (respecting old view of
the history) and reading it (where we expect to see the new view).

a. grafts file can change: lines can be added, removed or changed

b. shallow clones can be deepened or shortened, or even make
   not shallow

c. new replacements can be added, old removed, and existing edited


There are, as far as I can see, two ways of handling the issue of Git
features that can change the view of the project's history, namely:

 * Disable commit-graph reading when any of this features are used, and
   always write full graph info.

   This may not matter much for shallow clones, where commit count
   should be small anyway, but when using git-replace to stitch together
   current repository with historical one, commit-graph would be
   certainly useful.  Also, git-replace does not need to change history.

   On the other hand I think it is the easier solution.

Or

 * Detect somehow that the view of the history changed, and invalidate
   commit-graph (perhaps even automatically regenerate it).

   For shallow clone changes I think one can still use the old
   commit-graph file to generate the new one.  For other cases, the
   metadata is simple to modify, but indices such as generation number
   would need to be at least partially calculated anew.

Happily, you don't need to do this now.  It can be done in later series;
on the other hand this would be required before the switch can be turned
from "default off" to "default on" for commit-graph feature (configured
with core.commitGraph).

So please keep up the good work with sending nicely digestible patch
series.

> +
> +- The 'commit-graph' subcommand does not have a "verify" mode that is
> +  necessary for integration with fsck.

The "read" mode has beginnings of "verify", or at least "fsck", isn't
it?

[...]
> +- The current design uses the 'commit-graph' subcommand to generate the graph.
> +  When this feature stabilizes enough to recommend to most users, we should
> +  add automatic graph writes to common operations that create many commits.
> +  For example, one could compute a graph on 'clone', 'fetch', or 'repack'
> +  commands.

Automatic is good ("git gc --auto").

But that needs handling of view chaning features such as commit grafts,
isn't it?

> +
> +- A server could provide a commit graph file as part of the network protocol
> +  to avoid extra calculations by clients. This feature is only of benefit if
> +  the user is willing to trust the file, because verifying the file is correct
> +  is as hard as computing it from scratch.

Should server send different commit-graph file / info depending on
whether client fetches from refs/replaces/* nameespace?

> +
> +Related Links
> +-------------

Thank you for providing them (together with summary).

> +[0] https://bugs.chromium.org/p/git/issues/detail?id=8
> +    Chromium work item for: Serialized Commit Graph
> +
> +[1] https://public-inbox.org/git/20110713070517.GC18566@xxxxxxxxxxxxxxxxxxxxx/
> +    An abandoned patch that introduced generation numbers.
> +
> +[2] https://public-inbox.org/git/20170908033403.q7e6dj7benasrjes@xxxxxxxxxxxxxxxxxxxxx/
> +    Discussion about generation numbers on commits and how they interact
> +    with fsck.
> +
> +[3] https://public-inbox.org/git/20170908034739.4op3w4f2ma5s65ku@xxxxxxxxxxxxxxxxxxxxx/
> +    More discussion about generation numbers and not storing them inside
> +    commit objects. A valuable quote:
> +
> +    "I think we should be moving more in the direction of keeping
> +     repo-local caches for optimizations. Reachability bitmaps have been
> +     a big performance win. I think we should be doing the same with our
> +     properties of commits. Not just generation numbers, but making it
> +     cheap to access the graph structure without zlib-inflating whole
> +     commit objects (i.e., packv4 or something like the "metapacks" I
> +     proposed a few years ago)."
> +
> +[4] https://public-inbox.org/git/20180108154822.54829-1-git@xxxxxxxxxxxxxxxxx/T/#u
> +    A patch to remove the ahead-behind calculation from 'status'.



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux