Re: What actually is a branch?

Felipe Contreras <felipe.contreras@xxxxxxxxx> · Thu, 08 Jul 2021 19:45:37 -0500

Martin wrote:
> On 08/07/2021 22:37, Felipe Contreras wrote:
> > Technically a branch is a file with an object id in it. That doesn't
> > give the user any useful information.
> > 
> > What is important is the *meaning* of that file.
> > 
> >> People indeed tend to think, I branched at X, so anything before is not
> >> part of the branch.
> >> "--contains" says otherwise.
> > 
> > Yes, that is the status quo, but the fact that X is the case doesn't
> > mean it *should* be the case.
> 
> Well yes. So let's start over.
> 
> A branch is a container for commits. Those commits have a start (root or 
> base / not sure), and an end (head).
> The commits are continuous, in that they have no gaps.
> 
> The big question is the start point of the branch.
> 
> And there is a further consequence:
> If a branch "starts" at "base" then
>   --contains  needs to be changed
>   --reachable needs to be added (for what contains does now)

Indeed, but as of this moment @{base} is not being considered, it's just
a mental model tool.

> This also complicates it, because now there are 3 types of relation 
> between commits and a branch
> - unrelated (outside / not reachable)
> - inside (base..head)
> - reachable (base and all its parents) // better word needed

I think that has always been the case. The fact that the git
documentation doesn't talk about that doesn't mean the concept doesn't
exist.
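
A rough sketch of that three-way distinction, as a toy Python model rather
than anything git actually does. The `parents` map, `reachable()` and
`classify()` are made up for illustration, and the base is passed in by
hand because git does not record one:

```python
# Toy model, not git internals: each commit maps to the list of its parents.
parents = {
    "A": [],
    "B": ["A"],
    "C": ["B"],  # head of master
    "D": ["B"],
    "E": ["D"],  # head of foo
    "X": [],     # an unrelated root commit
}

def reachable(commit):
    """Return the commit itself plus all of its ancestors."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents[c])
    return seen

def classify(commit, base, head):
    """Relation of `commit` to a branch given a hypothetical base and its head."""
    if commit not in reachable(head):
        return "unrelated"   # outside / not reachable
    if commit in reachable(base):
        return "reachable"   # the base and everything behind it
    return "inside"          # base..head

for c in sorted(parents):
    print(c, classify(c, base="B", head="E"))
```

With foo based at B, this reports D and E as inside, A and B as reachable,
and C and X as unrelated.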

> The last is important:
> 
> A => B => C master
>       \ => D  foo
> 
> If I delete master, without the concept of reachable, I would expect 
> commit A to be dropped. Technically B should drop too, but it takes some 
> insight to expect that.
> So then with only the branch foo left, I would also have only the commit 
> D (well maybe B too, if the system is lenient)

Commits don't need a branch to exist. B could have a tag 0.3.7 and no
branch pointing to it. There could be other refs pointing to that
commit.
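
To make that concrete, here is a toy Python sketch of your diagram in which
a commit survives as long as any ref can reach it. It is a simplification
(real git also looks at reflogs and other things before pruning), and
`kept()` is a made-up helper name:

```python
# Toy model of the diagram above: A => B => C (master), B => D (foo).
parents = {"A": [], "B": ["A"], "C": ["B"], "D": ["B"]}

def reachable(commit):
    """Return the commit itself plus all of its ancestors."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents[c])
    return seen

def kept(refs):
    """Commits reachable from at least one ref (branch, tag, or anything else)."""
    result = set()
    for tip in refs.values():
        result |= reachable(tip)
    return result

print(sorted(kept({"master": "C", "foo": "D"})))  # ['A', 'B', 'C', 'D']
print(sorted(kept({"foo": "D"})))                 # ['A', 'B', 'D']
print(sorted(kept({"0.3.7": "B"})))               # ['A', 'B']
```

Deleting master only makes C unreachable; A and B stay because foo (or the
tag) still reaches them.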

> One might even go and say that if master is deleted, then the base of foo is
> deleted. Since foo must have a base, and it no longer has one, foo cannot
> exist any longer.

Of course it can. The base of a branch doesn't necessarily need to be
part of any other branch.

Or another way to think of it is that B is part of an unnamed branch.

> > A branch that you hold, or point to, is a concrete concept easy to
> > understand. When I say: "me, my sister, and my father are one tiny branch
> > of the Contreras family", people understand what that means intuitively.
> > 
> > On the other hand saying "Felipe contains his great-great-grandfather"
> > would stop anyone in their tracks.
> 
> The Chicago branch of your family contains Al Capone.
> That works.

Sure, if you start from a certain grandparent, not if you start from my
grandfather.

Most humans have issues with more than 7 items. A branch containing
millions of members reaching as far back as a fish is a notion an
evolutionary biologist might not have any problem with, but most people
would struggle.

For most people a branch must start from somewhere.

> > But if you do `git reset --hard origin/master`, you are saying: drop
> > everything about this branch, and make it the same 'origin/master'.
> > *Now* we have a reason to distinguish `git merge --ff-only` from `git
> > reset --hard`.
> 
> No you don't. IMHO not.
> "reset --hard" resets the branch to a commit. You can specify that 
> commit by giving a branch-name (that then will be resolved). But it 
> could be any commit, even a detached one.

OK. Sure. It could be repurposed to say what I explained, but we might
be overloading that command in that case.

How about `git branch --reset <otherbranch>`?

> So "reset --hard" has to set the base and the head to the same commit. 
> Effectively creating an empty branch based at that commit.

Maybe. Or maybe the base remains the same. Fortunately that's not
something we need concern ourselves with at this moment.

> But local tracking branches still are counter intuitive.
> 
> IMHO local tracking branches should follow one of the following 
> scenarios. (And ideally that should be the same for all local tracking 
> branches, for any user.)
> 
> 1) Always have the same base as their remote branch.
> Therefore always have the same content as the remote branch, up to where 
> they diverge, if they diverge.
> 
> 2) Not include the remote branches content. Just hold my local commits, 
> until they will be pushed to the remote.
> 
> But neither works:
> 
> Say I have a local commit, and you pushed new changes to the remote.
>     git pull --rebase
> My branch is rebased.
> So my local tracking branch has its base at the head of the remote. It 
> has only local commits => case 1.
> 
> Say I have no local commits, and you pushed new changes to the remote.
>     git pull --ff-only
> If I understand correctly, --ff-only moves the head of my local branch,
> but leaves the base where it is.
> Now I have some shared commits with the remote branch.
> => either case 2, or worse none of the 2 cases.

There's no need for --ff-only, do `git pull --rebase` in both cases, and
the base will constantly be reset to the remote head.

However, at least I never do this. My 'master' branch doesn't contain
any commits and I always do the equivalent of `git pull --ff-only`, so
the base would never change.

> > If you send a pull request for your 'master' branch, which then gets
> > merged to 'origin/master', then you can do `git merge --ff-only` to
> > advance the head pointer of the 'master' branch to the remote branch so
> > both are in sync... Except the base won't be the same.
> 
> There may be something I missed. ff should not touch the base?
> So the 2 bases will still be the same or not the same, depending on if 
> they were equal before the ff?

That's right. Before the fast-forward the base was different (because of
the rebase), so after the fast-forward the base remains different.
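
In toy Python form, under the assumption that a branch is a (base, head)
pair. Nothing like this exists in git today; `Branch`, `fast_forward()` and
`rebase()` are illustrative names, and the rebase only handles a linear
chain of commits:

```python
from dataclasses import dataclass

# Toy history: origin/master is A-B-C, and L is one local commit made on top of A.
parents = {"A": [], "B": ["A"], "C": ["B"], "L": ["A"]}

@dataclass
class Branch:
    base: str  # the hypothetical @{base}; git does not record this today
    head: str

def ancestors(commit):
    """Return the commit itself plus all of its ancestors."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents[c])
    return seen

def fast_forward(branch, target):
    """Move only the head; the base is left where it was."""
    if branch.head not in ancestors(target):
        raise ValueError("not a fast-forward")
    return Branch(branch.base, target)

def rebase(branch, onto):
    """Copy the commits in base..head on top of `onto`; the base becomes `onto`."""
    chain = []
    c = branch.head
    while c != branch.base:      # assumes base is on head's first-parent chain
        chain.append(c)
        c = parents[c][0]
    new_head = onto
    for c in reversed(chain):    # replay oldest first
        copy = c + "'"           # a rewritten commit gets a new identity
        parents[copy] = [new_head]
        new_head = copy
    return Branch(onto, new_head)

print(fast_forward(Branch("A", "A"), "C"))  # Branch(base='A', head='C')
print(rebase(Branch("A", "L"), "C"))        # the base is now C
```

The fast-forwarded branch and the rebased branch can end up with heads on
the same history while still disagreeing about where they started.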

> >> So yes, what is a branch? More exactly what does it contain.
> >> Two examples, that to me suggest two answers.
> > 
> > Not necessarily. See above.
> 
> I feel we must have some understanding of how base and local 
> branches would interact.
> 
> You agree: rebase changes the base (it creates a new branch on to --onto)
> 
> You pointed out there also is fast-forward. But see my above example.
> I am not even doing a pull request. I simply assume you and I both can 
> push to the same remote. So we both commit to master and pull/push it.

It doesn't matter who does the merge:

  git merge origin/master
  git push

It would be the same as a pull request followed by a fast-forward
(except with the parents reversed).

The base remains unmoved.

> >> Also if branch@{base}..branch  then there is a problem.
> >> - branch@{base} is then correctly not part of the branch
> >> - So immediately after "git switch -c branch" the branch is empty => ok
> >> But if so, then what is the branch head at that time?
> >> The pointer would point to @{base}, but @{base} is outside the branch. So
> >> the pointer of the branch points outside the branch?
> > 
> > Yes, the base pointer doesn't include the branch. When you do
> > `branch@{base}..branch` that's the same as `^branch@{base} branch` so that
> > excludes all the commits reachable from branch@{base} *including* that
> > commit itself.
> 
> My question is, where you see the branch head pointing to?
> If the branch is empty, i.e. if it has no commit at all, then to what 
> commit does the branch head point?

To the same commit as the base: master..master contains zero commits.
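
A minimal Python sketch of the range arithmetic, modelling `X..Y` as
"reachable from Y minus reachable from X" (which is what `^X Y` means).
The `parents` map and `commit_range()` are illustrative names, not git
code:

```python
# Toy linear history: A <- B <- C.
parents = {"A": [], "B": ["A"], "C": ["B"]}

def reachable(commit):
    """Return the commit itself plus all of its ancestors."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents[c])
    return seen

def commit_range(exclude, include):
    """Model of `exclude..include`, i.e. `^exclude include`."""
    return reachable(include) - reachable(exclude)

print(sorted(commit_range("A", "C")))  # ['B', 'C']
print(sorted(commit_range("C", "C")))  # []
```

So a head sitting on the base gives an empty range, and the head still
points at a perfectly valid commit.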

> >> The only problem is:
> >> branch is too often used for "the commits contained in the branch". That
> >> is way too common to even try to stop it.
> > 
> > We don't need to stop it, we can sidestep it.
> > 
> > Instead of talking about the branch, talk about the branch head:
> > "the branch head is moved to X".
> 
> Yes well, we need to be very precise, if we speak about anything that is 
> not the "commits in the branch".
> 
> 
> >>> When you change the branch head you are effectively changing the branch.
> >> Well if branch is the pointer, then you change the branch, and head is
> >> being changed.
> >> If branch is the content, then you change the head, and yes the content
> >> changes.
> > 
> > Exactly, so regardless of which semantics you choose, everyone
> > understands that the branch is not the same anymore.
> > 
> 
> Your original text was
> > When you change the branch head you are effectively changing the branch.
> > If @{base} existed, then changing the base would also change the branch
> > (although that would be a much less dangerous operation).
> > 
> > Does that make sense?
> 
> And yes, if either boundary changes, the branch changed.

But our immediate concern is to improve the documentation of
`git switch -C`, and perhaps improve the interface while we are at it.

I believe we have all the semantic tools needed to write something that
is understandable by most people regardless of their conception of what a
branch is.

No?

-- 
Felipe Contreras