Re: [PATCH v3 4/9] commit-graph.txt: update design document

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Derrick Stolee <dstolee@xxxxxxxxxxxxx> writes:

> We now calculate generation numbers in the commit-graph file and use
> them in paint_down_to_common().

All right.

>
> Expand the section on generation numbers to discuss how the three
> special generation numbers GENERATION_NUMBER_INFINITY, _ZERO, and
> _MAX interact with other generation numbers.

Very good.

>
> Signed-off-by: Derrick Stolee <dstolee@xxxxxxxxxxxxx>
> ---
>  Documentation/technical/commit-graph.txt | 30 +++++++++++++++++++-----
>  1 file changed, 24 insertions(+), 6 deletions(-)
>
> diff --git a/Documentation/technical/commit-graph.txt b/Documentation/technical/commit-graph.txt
> index 0550c6d0dc..d9f2713efa 100644
> --- a/Documentation/technical/commit-graph.txt
> +++ b/Documentation/technical/commit-graph.txt
> @@ -77,6 +77,29 @@ in the commit graph. We can treat these commits as having "infinite"
>  generation number and walk until reaching commits with known generation
>  number.
>
> +We use the macro GENERATION_NUMBER_INFINITY = 0xFFFFFFFF to mark commits not
> +in the commit-graph file. If a commit-graph file was written by a version
> +of Git that did not compute generation numbers, then those commits will
> +have generation number represented by the macro GENERATION_NUMBER_ZERO = 0.

I have to wonder if there would be any relesed Git that do not compute
generation numbers...

On the other hand in case the user-visible view of the project history
changes, be it because shallow clone is shortened or deepened, or grafts
file is edited, or a commit object is replaced with another with
different parents - we can still use "commit-graph" data, just pretend
that generation numbers (which are invalid in altered history) are all
zero.  (I'll write about this idea in comments to later series.)

On the other hand with GENERATION_NUMBER_ZERO these series of patches
are self-contained and bisectable.

> +
> +Since the commit-graph file is closed under reachability, we can guarantee
> +the following weaker condition on all commits:

I have had to look up the contents of the whole file, but it turns out
that it is all right: "weaker condition" refers to earlier "N <= M".

Minor sidenote: if one would be extremly pedantic, one could say that
previous condition is incorrect, because it doesn't state explicitely
that commit A != commit B. ;-)

> +
> +    If A and B are commits with generation numbers N amd M, respectively,
> +    and N < M, then A cannot reach B.
> +
> +Note how the strict inequality differs from the inequality when we have
> +fully-computed generation numbers. Using strict inequality may result in
> +walking a few extra commits, but the simplicity in dealing with commits
> +with generation number *_INFINITY or *_ZERO is valuable.
> +
> +We use the macro GENERATION_NUMBER_MAX = 0x3FFFFFFF to for commits whose
> +generation numbers are computed to be at least this value. We limit at
> +this value since it is the largest value that can be stored in the
> +commit-graph file using the 30 bits available to generation numbers. This
> +presents another case where a commit can have generation number equal to
> +that of a parent.

I wonder if something like the table I have proposed in v2 version of
this patch [1] would make it easier or harder to understand.

[1]: https://public-inbox.org/git/86a7u7mnzi.fsf@xxxxxxxxx/

Something like the following:

             |                      gen(B)
             |
gen(A)       | _INFINITY | _MAX     | larger   | smaller  | _ZERO
-------------+-----------+----------+----------+----------+--------
_INFINITY    | =         | >        | >        | >        | >
_MAX         | < N       | =        | >        | >        | >
larger       | < N       | < N      | = n      | >        | >
smaller      | < N       | < N      | < N      | = n      | >
_ZERO        | < N       | < N      | < N      | < N      | =

Here "n" and "N" denotes stronger condition, and "N" denotes weaker
condition.  We have _INFINITY > _MAX > larger > smaller > _ZERO.

> +
>  Design Details
>  --------------
>
> @@ -98,17 +121,12 @@ Future Work
>  - The 'commit-graph' subcommand does not have a "verify" mode that is
>    necessary for integration with fsck.
>
> -- The file format includes room for precomputed generation numbers. These
> -  are not currently computed, so all generation numbers will be marked as
> -  0 (or "uncomputed"). A later patch will include this calculation.
> -
>  - After computing and storing generation numbers, we must make graph
>    walks aware of generation numbers to gain the performance benefits they
>    enable. This will mostly be accomplished by swapping a commit-date-ordered
>    priority queue with one ordered by generation number. The following
> -  operations are important candidates:
> +  operation is an important candidate:
>
> -    - paint_down_to_common()
>      - 'log --topo-order'
>
>  - Currently, parse_commit_gently() requires filling in the root tree

Looks good.



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux