Re: [PATCH] glossary: improve "branch" definition

Junio C Hamano <gitster@xxxxxxxxx> writes:

> Sergey Organov <sorganov@xxxxxxxxx> writes:
>
>>> But do we need to say "a separate line of development", instead of
>>> just "a line of development"?  What is "a line of development" that
>>> is not separate?  What extra pieces of information are we trying to
>>> convey by having the word "separate" there?
>>
>> I think it tries to convey a notion that 2 branches represent separate
>> lines of development. I.e., that the whole purpose of branching is to
>> provide support for independent, or parallel, or /separate/ lines of
>> development.
>
> So in the context of talking about a branch, there is no need to say
> "a separate line".  It only starts making sense to use the word
> "separate" when you say "this is a line of development.  By the way,
> there is another line of development that is separate from the first
> one".
>
>> I'm not going to insist on the exact wording though, -- just wanted to
>> bring attention to the issue, and "separate" was somehow the first word
>> that came to mind when I edited the text.
>>
>> As an after-thought, I'd probably add that branch in Git is represented
>> by a chain of commits, and then I'd refer to most recent commit of the
>> chain, instead of most recent commit on the branch. That'd make
>> definition more formal and precise. Makes sense?
>
> It brings up a more serious issue, though.  
>
>          o---o---o---o---x A
>         /             \
>     ---o---o---o---o---o---o---y B
>
> The only thing everybody can agree on in the above history is that
> commit 'x' is at the tip of the branch A, and commit 'y' is at the
> tip of the branch B,

Yeah, sure.

> and 'y^' is on the branch B.

I'm not so sure about 'y^', sorry, even if it now has no other
references.

I'm only sure that commits not reachable from B through first parents
are not on the branch B, and that there is a chain, possibly empty,
starting at "Git branch B" and following first parents, that constitutes
a particular branch in the user domain.
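
To make the first-parent point concrete, here is a toy sketch in Python
(not Git code). The names a1..a4 and b1..b6 are made up to label the 'o'
commits in the diagram above, and b1 is treated as a root even though the
real history continues to its left. The merge b5 is assumed to have the
commit on B's line as its first parent.

```python
# Toy model of the quoted diagram; commit names are hypothetical.
# Each commit maps to its list of parents, first parent first.
parents = {
    "b1": [],                       # assumed root for this sketch
    "b2": ["b1"], "b3": ["b2"], "b4": ["b3"],
    "a1": ["b1"], "a2": ["a1"], "a3": ["a2"], "a4": ["a3"],
    "x": ["a4"],                    # tip of A
    "b5": ["b4", "a4"],             # the merge into B's line
    "b6": ["b5"],
    "y": ["b6"],                    # tip of B
}

def first_parent_chain(tip):
    """Follow only first parents, starting at tip."""
    chain = []
    commit = tip
    while commit is not None:
        chain.append(commit)
        ps = parents[commit]
        commit = ps[0] if ps else None
    return chain

def reachable(tip):
    """All commits reachable from tip through any parent."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen

chain_b = first_parent_chain("y")
merged_only = reachable("y") - set(chain_b)
```

Here chain_b walks y, b6, b5, b4 and so on down the bottom line, while
merged_only holds a1..a4: reachable from B, but not through first parents.
Nothing in the model says where either branch "begins".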

> There is no good answer to questions like
>
>   where does branch 'A' begin?
>   where does branch 'B' begin?

There is: "it's undefined".

Why does it matter for the definition of the term "branch"? I think it
doesn't. The glossary didn't define where branches begin, and it would
still decline to define it. What's the issue with that?

Where exactly a given branch starts lies entirely in the user domain, not
in the Git domain, so we don't need to define this in the Git glossary, I
think. We can mention why we left it undefined though, if it makes
sense.

>
> Perhaps the merge to 'B' was from another branch that no longer
> exists (because the whole 4-commit chain was merged at that point to
> the integration branch 'B'), and 'A' was forked from that branch
> whose name was forgotten.

Perhaps, but I can't see how it's relevant to the glossary. It'd be
essential if Git remembered on which branch each commit was created,
but it (fortunately) doesn't, so it (fortunately) isn't.

> Any commit in the history represents a line of development behind
> that commit, and whether a commit is pointed at by a ref does not
> change that.

Sure. Moreover, the user is free to consider this particular line of
development to be a "branch" in their own vocabulary.

We do not call /every/ line of development a "branch" in Git proper, or
do we? I'd say that in Git proper a "branch" is not a line of development
at all, because Git doesn't care.

> And development is not even a line when you include forking and
> merging.

Development isn't. A line of development is "a line" by definition
though.

>
> In the mental model of Git about branches, I think the only one
> thing people can agree on is that a branch points at a commit, and
> checking it out and making a commit on top of it will change that
> branch to point at the newly created commit.  And this view supports
> the word "separate"---whether you have two branches pointing at the
> same commit or a different one, building a new commit on and
> advancing the tip of one branch does not affect the other branch.

So, as it does make sense, why don't we stick to "separate"?
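
The behaviour described in the quoted paragraph can be sketched with a
toy model in Python. This is an illustration only, not how Git stores
refs; the branch names "one" and "two" and the commit ids are made up.

```python
# Toy model: a branch is just a name mapped to a commit id.
parent = {"c0": None}                   # commit id -> parent id
branches = {"one": "c0", "two": "c0"}   # both branches point at c0

def commit_on(branch, new_id):
    """Create new_id on top of the branch tip and advance only that branch."""
    parent[new_id] = branches[branch]
    branches[branch] = new_id

commit_on("one", "c1")
```

After the call, "one" points at c1 while "two" still points at c0, even
though both started at the same commit.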

> Come to think of it, the original "active" may not have been so bad
> a word to begin with.  It is misleading in the sense that "active"
> used in the original statement does not mean "currently checked
> out", but if we read it as "potentially active---can grow in its own
> direction", it does convey that each branch can (although does not
> have to) represent its own line of development.

I don't get it. Being "potentially active" is not what distinguishes a
"branch". Every commit is potentially active: "git checkout
<commit>" and grow new history out of it. Further, even the current
branch could be inactive, so I still fail to see any reason to use
"active" in the branch definition.

>
> So, I dunno.  I'd say just settling on the simplest "is a line of
> development" would be the easiest path for now.

Thinking more about it, this first phrase describes an entirely
user-domain entity, so we will have a hard time coming up with a strict
definition anyway, and "is a line of development" is as fine with me as
"is a separate line...", because neither has anything to do with Git the
program :-)

Now, if we stay inside Git proper in the glossary, we'd need to get rid
of this first phrase and stick to what a "branch" is in Git. And in Git
it's just a specific type of reference that (unlike, say, tags) follows
new commits. Interestingly, it follows from this definition that we may
easily consider HEAD to be a meta-branch that, in addition to the
properties of other branches, first, defines the point in the DAG from
which new commits are grown, and second, can point to another branch.

For what it's worth, here is a description of "branch", as I see it:

branch: A /branch/ is a way to refer by name to a particular line of
        development represented as a /chain/ of commits. A /branch/ is
        implemented as a specific kind of /reference/, called a branch
        /head/, that always points to the most recent commit of the
        branch /chain/ it names. This most recent commit of the
        branch chain is referred to as the tip of the branch. The branch
        head moves forward along with the branch tip as the branch chain
        grows through the addition of new commits.

        A Git /repository/ can have an arbitrary number of branches, but
        your /working tree/ can be associated with at most one of them
        at any given time (the "current" or "checked out" branch), and
        a special /reference/ called /HEAD/ points to the head of that
        branch in this case.

        Alternatively, /HEAD/ can point directly to a commit rather
        than to a branch head, in which case an unnamed chain will grow
        from this commit as additional commits are made. This mode
        of operation is referred to as "detached HEAD", though for
        uniformity it could be regarded as being on an "unnamed branch".
        You can still give a name to this "unnamed branch" at any time,
        thus turning it into yet another regular branch.
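
The description above can be sketched as a small toy model in Python. It
is only meant to illustrate the wording, not how Git stores anything; the
class Repo, its methods, and the commit ids c0, c1, ... are all made up
for the example.

```python
class Repo:
    """Toy model of branch heads and HEAD; not Git's on-disk format."""

    def __init__(self):
        self.parent = {"c0": None}         # commit id -> first parent
        self.branches = {"master": "c0"}   # branch head -> tip commit
        self.head = ("branch", "master")   # or ("commit", id) when detached

    def tip(self):
        """Commit that the next new commit will be grown from."""
        kind, value = self.head
        return self.branches[value] if kind == "branch" else value

    def commit(self):
        new_id = f"c{len(self.parent)}"
        self.parent[new_id] = self.tip()
        kind, value = self.head
        if kind == "branch":
            self.branches[value] = new_id    # branch head follows the tip
        else:
            self.head = ("commit", new_id)   # unnamed chain grows
        return new_id

    def detach(self, commit_id):
        """Point HEAD directly at a commit ("detached HEAD")."""
        self.head = ("commit", commit_id)

    def name_current(self, name):
        """Give the current chain a name and attach HEAD to it."""
        self.branches[name] = self.tip()
        self.head = ("branch", name)

repo = Repo()
repo.commit()               # c1 on master; the master head moves to c1
repo.detach("c0")           # HEAD now points directly at c0
repo.commit()               # c2 grows from c0 on an unnamed chain
repo.commit()               # c3 on top of c2
repo.name_current("topic")  # the unnamed chain becomes a regular branch
```

While HEAD is detached, no branch head moves; only naming the chain at
the end creates a new branch head, and "master" is left where it was.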


-- Sergey Organov


