Re: [RFC PATCH 0/3] Support for tail (branch point) experiment

On Fri, Mar 10, 2023 at 6:04 PM Junio C Hamano <gitster@xxxxxxxxx> wrote:
>
> Felipe Contreras <felipe.contreras@xxxxxxxxx> writes:
>
> > This is *not* meant a serious proposal, it's just an exploration of an
> > idea.
>
> It is easy to explain and understand the benefit of keeping a
> separate pointer to the bottom [*] of the branch on top of which the
> history leading to the commit at the tip of the branch has been
> built, but the devil is in the details of how such a bottom pointer
> will be maintained.
>
>     side note: below, I use "bottom" because for me it is the most
>     natural term to refer to the starting end of the range of
>     commits.  In the context of this topic, readers can replace any
>     "bottom" they see with "tail", if they prefer.

Perhaps @{base} would be better (I think that was my original name).
Mercurial has an experimental feature called "topics", and that's the
name they use for the starting point of a topic.

> In a sense, this is very similar to the idea of "notes".  It is easy
> to explain and understand that a bag of objects, in which additional
> data can be associated with an object name, can be used to keep
> track of extra data on commits (and other objects) after they are
> created without invalidating their object name.  As long as they are
> copied/moved when a commit is used to create another copy of it.
> The "notes" are automatically copied across "rebasing", which is one
> of the many details that makes the "notes" usable, but cherry-pick
> that does not honor notes.rewriteRef sometimes leads to frustration.

I implemented that in 2014 [1].

There's no actual reason that couldn't work in 2023 if we wanted it to.

But this is an argument in favor of @{base} (or whatever): even if
notes are not perfect, they can still be useful in certain situations,
and that's certainly better than not having the information. Similarly,
@{base} doesn't have to be perfect in the first iteration: the natural
points at which it's updated can be implemented later. Just by
existing it would provide some potentially useful information to the
user, which is better than nothing.

> Creation of a new branch with "git branch" would be an obvious point
> to add such a bottom pointer, and "git rebase" is a good point to
> update such a bottom pointer.  But there are many other ways that
> people update their branches, depending on the workflow, and
> guessing when to update the bottom pointer and trying to be complete
> with the heuristics will lead to the same "no, we do not know all
> users' workflows" that made approaches based on reflog parsing
> etc. fail to solve the "where did the branch start?" puzzle.
>
> And I think what is sketched in these RFC patches can be a good
> starting point for a solution that strikes a good balance.  "git
> rebase", which is the most common way to mangle branches, is taught
> to update the bottom pointer automatically.
>
> Giving users an explicit way to set the bottom when manipulating
> branches would help those who mangle their branches with something
> other than "git rebase" in the most trivial form.  I suspect that is
> still missing in this RFC?

Yes, we would want a way to update the base manually, just like with
@{upstream}.

> Of course other things on the consuming side may be missing, like
> send-email or format-patch, but they are a lot more trivial to add and
> will be useful.  As long as the bottom pointer is properly maintained,
> that is.

Yes, but that can be done later. If @{base} is useful and updated in a
good enough manner, users are obviously going to want it used in tools
like `git send-email`, but even before that, just being able to do
`@{base}..` is useful (even if the base has to be set manually).
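
As a rough sketch of what a manually maintained base could look like
with today's git, assume the base of each branch is kept in a
hypothetical `refs/bases/<branch>` ref. Git assigns no meaning to that
namespace, and the helper names `set_base` and `base_range` are made up
for illustration:

```shell
# Hypothetical convention: the base of <branch> lives in refs/bases/<branch>.
# git itself does not know about this namespace.

# set_base <branch> <commit>: record <commit> as the base of <branch>.
set_base() {
    git update-ref "refs/bases/$1" "$2"
}

# base_range <branch>: print the range that "<branch>@{base}..<branch>"
# would denote under this convention.
base_range() {
    printf '%s..%s\n' "refs/bases/$1" "refs/heads/$1"
}
```

With that in place, `set_base topic master` followed by
`git log --oneline "$(base_range topic)"` lists only the commits on
`topic`, even after `master` moves on.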

> A few of the things that I often do to mangle my branches are
> listed.  Some of them are not application of "git rebase" in the
> trivial form:
>
>  * I have a patch series (single strand of pearls).  I update on
>    top of the updated upstream:
>
>     $ git rebase -i --onto master @{bottom}
>     $ git range-diff @{bottom}@{1}..@{1} @{bottom}..HEAD
>
>    No, this is not what "I often do" yet, but I hope to see become
>    doable.  Rebase the current branch from its bottom on top of the
>    master, and then take the range diff between the old branch
>    (i.e. @{bottom} refers to the bottom pointer, but because it is
>    implemented as a ref, its reflog knows what the previous value of
>    it was---@{bottom}@{1}..@{1} would be the range of commits on the
>    branch before I did the above rebase) and the new one.

That would work only if the last update was a rebase. To make it work
reliably we would need some sort of branchlog.

Personally I have a similar use case, but I want to use range-diff
mainly before sending a patch series. What my tool `git send-series`
does is store for example `refs/sent/test-aggregate/v2` and
`refs/sent/test-aggregate/v2-tail`. Conceptually this is v2 of the
patch series.
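
To illustrate how such refs could feed range-diff, here is a minimal
sketch. It assumes every version N of a series is stored as
`refs/sent/<name>/vN` (tip) and `refs/sent/<name>/vN-tail` (starting
point), extrapolating from the v2 example above; the helper name
`series_range_diff` is hypothetical and is not part of `git send-series`:

```shell
# series_range_diff <name> <old> <new>: compare two stored versions of a series.
# Assumes refs/sent/<name>/v<N>-tail..refs/sent/<name>/v<N> spans version N.
series_range_diff() {
    name=$1 old=$2 new=$3
    git range-diff \
        "refs/sent/${name}/v${old}-tail..refs/sent/${name}/v${old}" \
        "refs/sent/${name}/v${new}-tail..refs/sent/${name}/v${new}"
}
```

For example, `series_range_diff test-aggregate 1 2` would compare v1
against v2, provided both pairs of refs exist.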

>  * I have 7 patch series (single strand of pearls).  I only need to
>    touch the top 3.
>
>     $ git rebase -i HEAD~3
>     $ git range-diff @{1}...
>
>    In this case, I am not updating the bottom to HEAD~3 and reducing
>    the branch into 3-patch series.  I am keeping the bottom of the
>    branch, and the commits that happen to be updated are only the
>    topmost 3.

Right, maybe the base should be updated only when --onto is supplied,
or perhaps even only with a new --base option, so it's clear the user
wants the new behavior.

>  * In the same situation, but the top 3 in the original are so bad
>    that I am better off redoing them from scratch, taking advantage
>    of new features in 'master'.
>
>     $ git checkout --detach master
>     ... work on detached HEAD ...
>     ... first pick the bottom commits ...
>     $ git cherry-pick master..@{-1}~3
>     ... still working on detached HEAD ...
>     ... redo the topmost commits from scratch ...
>     $ git range-diff master..@{-1} master..
>     $ git checkout -B @{-1}
>
>    I do not mind "checkout -B" *not* learning any trick to
>    automatically update the bottom pointer for the branch to
>    'master' in this case, but I should be able to manually update
>    the bottom of the branch easily.  Something like "git checkout -B
>    @{-1} --set-bottom=master" might be acceptable here.

Yes, something like that would be needed.

One obvious use case for me is "show me the current branch", as in
`git log @{base}..@`. Because `git log` is very efficient that's
usually not necessary, but I often launch `gitk`, and it's annoying
that it tries to load *all* the reachable commits, wasting resources
and polluting the view. That's why I started developing a tool that
essentially did `gitk $1@{u}..$1`, but that quickly becomes complex if
upstream isn't configured. With my tool I can do `git vs` (show the
current branch visually), or `git ls` (show the current branch on the
command line).

Weirdly enough, Mercurial's new topic extension has a command that
shows precisely that: `hg stack` shows only the commits on the current
topic (starting from a base).
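
The complexity comes from having to pick a starting point when there is
no upstream. A minimal sketch of such a fallback, not the actual tool:
it prefers the configured upstream and otherwise looks for a
hypothetical `refs/bases/<branch>` ref, a namespace git does not define.
The function name `branch_range` is likewise made up:

```shell
# branch_range <branch>: print a range covering only the commits on <branch>.
# Order of preference: configured upstream, then the hypothetical
# refs/bases/<branch> ref. Returns non-zero if neither exists.
branch_range() {
    branch=$1
    if git rev-parse --verify --quiet "${branch}@{upstream}" >/dev/null 2>&1; then
        printf '%s@{upstream}..%s\n' "$branch" "$branch"
    elif git rev-parse --verify --quiet "refs/bases/${branch}" >/dev/null 2>&1; then
        printf 'refs/bases/%s..%s\n' "$branch" "$branch"
    else
        echo "no upstream or base recorded for ${branch}" >&2
        return 1
    fi
}
```

Something like `gitk "$(branch_range topic)"` then shows only the
commits of `topic` whenever either piece of information is available.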

And this reminds me of the previous discussion: What actually is a branch? [2]

If we can agree that `branch@{base}..branch` semantically is
*something* (whatever you want to call it), then it might make sense
to have a way to refer to it, for example `branch^b` or `branch+`.

Then interesting combinations immediately become obvious, for example your:

    git range-diff @{bottom}@{1}..@{1} @@{bottom}..@

Becomes:

    git range-diff @{1}+ @+

Then if we expand that we can see that @{base} should be an operation
on @{1} (@{1}@{base}), not the other way around (@{base}@{1}).
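
The expansion can be spelled out as a purely textual rewrite. Neither
the `+` suffix nor `@{base}` is understood by git today, so the
hypothetical helper `expand_plus` below only manipulates strings to
show where `@{base}` ends up:

```shell
# expand_plus <rev>: expand the hypothetical "<rev>+" shorthand into
# "<rev>@{base}..<rev>". Arguments without a trailing "+" pass through.
expand_plus() {
    case $1 in
        *+) rev=${1%+}
            printf '%s@{base}..%s\n' "$rev" "$rev" ;;
        *)  printf '%s\n' "$1" ;;
    esac
}
```

Here `expand_plus '@{1}+'` yields `@{1}@{base}..@{1}` and
`expand_plus '@+'` yields `@@{base}..@`: the base is looked up on
whatever revision precedes the `+`.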

> IOW, I do not mind if maintenance of the bottom of the branch is not
> always automatic (and prone to heuristic making an incorrect guess).
> But I think we should make sure it is easy for the user to assist
> the tool to maintain it correctly [*].
>
>     Side note: and that is what I find "frustrating" in the "notes"
>     world.  "notes" can be copied after cherry-pick manually, but
>     that is a very tedious process, and at some point, being "merely
>     possible" stops to have much value, unless it is "easily
>     doable".

Agreed. Similarly, I did not start to use @{upstream} until it was easy to use.

But again: @{upstream} was not easy to use at the start, and @{base}
doesn't have to be either.

I think the important thing to not forget is that this is useful
information, and many would argue git is missing it.

Cheers.

[1] https://lore.kernel.org/git/1398307491-21314-13-git-send-email-felipe.contreras@xxxxxxxxx/
[2] https://lore.kernel.org/git/60e61bbd7a37d_3030aa2081a@natae.notmuch/

-- 
Felipe Contreras



