Re: [PATCH v3 0/6] Create commit-graph file format v2

SZEDER Gábor <szeder.dev@xxxxxxxxx> · Fri, 3 May 2019 16:16:14 +0200

On Fri, May 03, 2019 at 08:47:25AM -0400, Derrick Stolee wrote:
> It would be much simpler to restrict the model. Your idea of changing
> the file name is the inspiration here.
> 
> * The "commit-graph" file is the base commit graph. It is always
>   closed under reachability (if a commit exists in this file, then
>   its parents are also in this file). We will also consider this to
>   be "commit-graph-0".
> 
> * If a commit-graph-<N> exists, then we check for the existence of
>   commit-graph-<N+1>. This file can contain commits whose parents
>   are in any smaller file.
> 
> I think this resolves the issue of back-compat without updating
> the file format:
> 
> 1. Old clients will never look at commit-graph-N, so they will
>    never complain about an "incomplete" file.
> 
> 2. If we always open a read handle as we move up the list, then
>    a "merge and collapse" write to commit-graph-N will not
>    interrupt an existing process reading that file.

What if a process reading the commit-graph files runs short on file
descriptors and has to close some of them, while a second process is
merging commit-graph files?

> I'll start hacking on this model.

Have fun! :)

Semi-related, but I'm curious:  what are your plans for the
'graph_pos' field of 'struct commit', and how will it work with
multiple commit-graph files?

In particular: currently we use this 'graph_pos' field as an index
into the Commit Data chunk to find the metadata associated with a
given commit object.  But we could add any commit-specific metadata in
a new chunk, being an array ordered by commit OID, and then use
'graph_pos' as an index into this chunk as well.  I find this quite
convenient.  However, with multiple commit-graph files there will be
multiple arrays...
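
To make the question concrete, here is a rough sketch of one
conceivable arrangement.  It is purely an illustrative assumption, not
something the proposal above specifies: 'graph_pos' is treated as a
position in the concatenation of all files' arrays, and each file
records how many entries the lower files hold.  'struct graph_layer',
'num_commits_in_base', 'extra_data' and 'find_layer()' are all
hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: one commit-graph-<N> file with its own per-commit arrays. */
struct graph_layer {
	uint32_t num_commits;         /* entries in this file's arrays */
	uint32_t num_commits_in_base; /* total entries in all lower files */
	const uint32_t *extra_data;   /* some per-commit chunk, by local index */
};

/*
 * Map a chain-wide 'graph_pos' to the file that holds it and to the
 * index within that file's arrays.  'layers' is ordered from
 * commit-graph-0 upward.  Returns NULL if the position is out of range.
 */
static const struct graph_layer *find_layer(const struct graph_layer *layers,
					    size_t nr, uint32_t graph_pos,
					    uint32_t *local_pos)
{
	size_t i;

	assert(local_pos != NULL);

	/* Walk from the topmost file down to the base file. */
	for (i = nr; i-- > 0; ) {
		uint32_t base = layers[i].num_commits_in_base;

		if (graph_pos >= base) {
			uint32_t local = graph_pos - base;

			if (local >= layers[i].num_commits)
				return NULL;
			*local_pos = local;
			return &layers[i];
		}
	}
	return NULL;
}
```

With something like this, a lookup in any per-commit chunk would first
call find_layer() and then index that file's array with the returned
local position, e.g. layer->extra_data[local_pos].  The cost is one
extra step per lookup compared to the single-file case, and the
positions change whenever files are merged.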