Re: [PATCH v2 01/10] technical doc: add a design doc for the evolve command

Hi Chris!

Chris Poucet <poucet@xxxxxxxxxx> writes:

> One thing that is not clear to me is whether this is the desired
> direction. I took at look at the git review notes but it was hard to
> get a sense of where people are at.

I'm really sorry, I meant to get back to this sooner with the takeaways
from Review Club. Hopefully this will still be useful.

You can find the Review Club notes here:

  https://docs.google.com/document/d/14L8BAumGTpsXpjDY8VzZ4rRtpAjuGrFSRqn3stCuS_w/edit?pli=1

> Would love input on the design.

Others have given a lot of input on the design, so I'll focus mostly on
how to make the doc easier to review on the mailing list.

>
> On Wed, Oct 5, 2022 at 4:59 PM Stefan Xenos via GitGitGadget
> <gitgitgadget@xxxxxxxxx> wrote:
>>
>> From: Stefan Xenos <sxenos@xxxxxxxxxx>
>>
>> This document describes what a change graph for
>> git would look like, the behavior of the evolve command,
>> and the changes planned for other commands.
>>
>> It was originally proposed in 2018, see
>> https://public-inbox.org/git/20181115005546.212538-1-sxenos@xxxxxxxxxx/

This doc is quite well-thought-out and surprisingly readable despite its
length. That said, it is a lot to review in one sitting, and a reviewer
might get easily fatigued. I suspect that reviewers will find it hard to
keep up with the discussion if they have to review the entire doc on
every iteration.

As Victoria suggested in Review Club, it might be helpful to split up
the design over multiple patches to make feedback more focused. I think
this will make it easier for you (and others) to get a sense of how we
feel about each part of the design. e.g. here's one way to split up the
doc:

- Motivation, Background, High level idea of how a user would use this. 

  (Roughly corresponding to the sections "Objective", "Status",
  "Background", "Goals", "Similar technologies", "Semi-related work")

- Local change tracking, Changes to existing commands, Meta-commits

  (The parts about the data format and their implications for GC,
  negotiation, etc. Maybe include the `change` subcommand if it helps
  reviewers visualize the impact.)

- How evolve works (e.g. convergence, divergence, merge base finding)
  and its CLI

- Sharing changes

Besides the design, here are other sections that I would find useful:

- Glossary. I thought that terms like "change", "change branch" and
  "change graph" were underdefined. This would also be a useful
  reference during the implementation phase.

- Implementation Plan (you can find examples in
  Documentation/technical/bundle-uri.txt and
  Documentation/technical/sparse-index.txt). Making the concrete next
  steps visible has numerous benefits:
  - Reviewers of future patches know what problem is being tackled and
    what value is being delivered.
  - The list gains confidence that the author can deliver the work being
    promised.
  - The shared direction makes it easier for others to contribute
    patches.

- Open questions (e.g. "Implementation questions" in [1]). It would be
  useful to know what questions can be answered later instead of right
  now. Also, since you are not the original author, perhaps you also
  have questions about the design that you want answered by reviewers.
  I also wouldn't mind this being in the cover letter or "---" section.

[1] https://lore.kernel.org/git/pull.1367.git.1664064588846.gitgitgadget@xxxxxxxxx

As mentioned earlier, I'll comment only very lightly on the design.

>> +Similar technologies
>> +--------------------

I'd personally love to see "git evolve". If it helps to consider some
other tools, I use the following tools that implement similar workflows:

- git-branchless [2] features anonymous heads, obsolescence tracking,
  history manipulations and "git evolve". Having used this for a while,
  I'm of the opinion that having any of these features without the
  others is still very useful, and implementing them in phases
  will still deliver value without having to complete all of the work
  (granted, each of these features is incrementally dependent on the
  others).

  Case in point: I don't use the "evolve" equivalent of git-branchless
  (IIRC "restack"); being able to see obsolescence and manually
  manipulating history is good enough for me.

- Jujutsu [3] also features anonymous heads, obsolescence tracking and
  advanced history manipulations. Instead of "evolve", descendants of an
  obsolete commit are automatically rebased on the obsoleting commit.

[2] https://github.com/arxanas/git-branchless
[3] https://github.com/martinvonz/jj

>> +Changes
>> +-------
>> +A branch of meta-commits describes how a commit was produced and what previous
>> +commits it is based on. It is also an identifier for a thing the user is
>> +currently working on. We refer to such a meta-branch as a change.
>> +
>> +Local changes are stored in the new refs/metas namespace. Remote changes are
>> +stored in the refs/remote/<remotename>/metas namespace.

I find this terminology of "changes" and "metas" more confusing than
necessary. A glossary would help, but it might be even better to also
use an appropriate ref namespace. "refs/changes/" is an obvious
candidate, though I assume this wasn't mentioned because Gerrit uses
that namespace extensively.

Maybe `refs/changelists`, `refs/change-requests`, `refs/proposals`? Idk.

>> +Sharing changes
>> +---------------
>> +Change histories are shared by pushing or fetching meta-commits and change
>> +branches. This provides users with a lot of control of what to share and
>> +repository implementations with control over what to retain.
>> +
>> +Users that only want to share the content of a commit can do so by pushing the
>> +commit itself as they currently would. Users that want to share an edit history
>> +for the commit can push its change, which would point to a meta-commit rather
>> +than the commit itself if there is any history to share. Note that multiple
>> +changes can refer to the same commits, so it’s possible to construct and push a
>> +different history for the same commit in order to remove sensitive or irrelevant
>> +intermediate states.

I would not like to see the ability to share all intermediate states
with the server, because it greatly increases the risk of unintentional
disclosure.

How exactly we could tweak this can be an open discussion for later.
Some examples I can think of:
  - Asking the user to go through the obsolescence log and manually
    prune revisions (sounds too onerous for users IMO).
  - Push a truncated history consisting of only the latest version and
    commits that the server already knows (somewhat similar to Gerrit).
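
To make the second option concrete, here is a minimal sketch of the
filtering I have in mind. It is only an illustration: the function name,
the list-of-ids representation of a change's history and the
"server_known" set are all hypothetical and not part of the proposed
design.

```python
def truncate_history(versions, server_known):
    """Pick the versions of a change that would be pushed.

    versions: commit ids for one change, oldest first (hypothetical
        representation; the design doc stores these as meta-commits).
    server_known: set of commit ids the server already has
        (assumed to be learned during negotiation).
    """
    if not versions:
        return []
    latest = versions[-1]
    # Keep older versions only if the server has already seen them,
    # then always include the latest version.
    kept = [v for v in versions[:-1] if v in server_known]
    return kept + [latest]
```

For example, with versions a1..a4 where the server only knows a2, only
a2 and a4 would be shared; a1 and a3 stay local.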

>> +Evolve
>> +------
>> +The evolve command performs the correct sequence of rebases such that no change
>> +has an obsolete parent. The syntax looks like this:
>> +
>> +git evolve [upstream…]
>> +
>> +It takes an optional list of upstream branches. All changes whose parent shows
>> +up in the history of one of the upstream branches will be rebased onto the
>> +upstream branch before resolving obsolete parents.
>> +

This CLI is an example of something that can be reviewed largely
independently of the implementing data structures.
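
As a rough illustration of that point, the "no change has an obsolete
parent" rule can be sketched without any reference to meta-commits. The
two dicts below ("parent" and "successor") and the function names are
hypothetical stand-ins for whatever data structure ends up being used,
and this is a single pass only: it does not model the fact that rebasing
a commit obsoletes it and so affects its own descendants.

```python
def latest_successor(commit, successor):
    """Follow obsolescence links until reaching a non-obsolete commit."""
    seen = set()
    while commit in successor:
        if commit in seen:
            raise ValueError("cycle in obsolescence links")
        seen.add(commit)
        commit = successor[commit]
    return commit


def plan_evolve(parent, successor):
    """Return (commit, new_parent) pairs for one pass of rebases.

    parent: maps each commit id to its parent commit id.
    successor: maps each obsolete commit id to the commit replacing it.
    """
    plan = []
    for commit, old_parent in parent.items():
        if commit in successor:
            continue  # the commit itself is obsolete; nothing to rebase
        new_parent = latest_successor(old_parent, successor)
        if new_parent != old_parent:
            plan.append((commit, new_parent))
    return plan
```

With parent = {"B": "A", "C": "B"} and successor = {"A": "A2"}, this
plans a single rebase of B onto A2; C would only be picked up by a later
pass, once B has been rewritten.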

>> +Merge
>> +-----
>> +
>> +To select between these two behaviors, merge gets new “--amend” and “--noamend”
>> +options which select between the “create” and “modify” behaviors respectively,
>> +with noamend being the default.

Ditto.




