Re: [PATCH] Teach remote machinery about remotes.default config variable

Junio C Hamano wrote:
If it is truly only about "submodule update" then the change
seems too intrusive, especially "remotes.default" variable that
affects the way how fetch and merge works in situations that do
not involve submodules.
If it is not limited to "submodule update" but equally valid fix
to non-submodule situations, the changes to the other parts may
very well be justifiable, but that would mean your "Yes" is a
lie and instead should be "No, but these situations are helped
by these changes because...".

First, I re-sent the patch series last night; it now uses core.origin to avoid touching the remotes.* namespace.

The changes *do* fix a nit when on a non-tracking branch. With this, fetch / merge / pull will now honor that the user said (via git clone -o frotz) "my upstream is nicknamed frotz" and not try to use origin when origin was never defined.

So, while fixing this minor aggravation wasn't my motivation, I view this as a nice side-benefit :^).
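
As a rough sketch of the fallback order described above (not the patch itself): a bare fetch first honors the current branch's configured remote, then core.origin, and only then the hardcoded "origin". The helper name default_remote and its two-argument interface are made up for illustration, and the exact precedence is an assumption based on the description in this message.

```shell
# Hypothetical helper, not part of git: pick the remote a bare "git fetch"
# would use under the proposed core.origin semantics.
#   $1 = value of branch.<current>.remote (empty on a non-tracking branch
#        or a detached HEAD)
#   $2 = value of core.origin (empty if unset)
default_remote() {
    branch_remote=$1
    core_origin=$2
    if [ -n "$branch_remote" ]; then
        # A tracking branch always wins.
        printf '%s\n' "$branch_remote"
    elif [ -n "$core_origin" ]; then
        # Assumed new behavior: use the nickname chosen at clone time.
        printf '%s\n' "$core_origin"
    else
        # Old behavior: fall back to the hardcoded name.
        printf '%s\n' origin
    fi
}
```

Inside a repository the two arguments could be filled in with something like "$(git config branch.master.remote)" and "$(git config core.origin)"; after git clone -o frotz, a non-tracking branch would then resolve to frotz instead of an undefined origin.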

The driving issues:
1) I deal with too many servers for "origin" to be a useful nick name, and we have an agreed set of nickname / server pairings across my project.
2) Therefore, we always do git clone -o frotz  frotz.foo.bar/path_to_git.
3) Because of 2, for top-level, "origin" is not defined, tracking branches set up via git branch --track point to the correct remote, and we basically understand branch names as <nickname>/branch. In other words, we *are* aware of what server we are using.
4) git-submodule update breaks the above:
   a) it invokes git clone frotz.foo.bar/path_to_git, thus defining "origin" as the nickname for frotz.foo.bar.
   b) it invokes bare git-fetch on a detached head, so the upstream *has* to be origin.

If your top-level repository needs to access a specific server
"frotz.foo.bar" for updates, then you would have bootstrapped
the whole thing with:

	$ git clone git://frotz.foo.bar/toplevel.git

and in that particular instance of the repository, the source
repository on frotz.foo.bar would have been known as 'origin',
right?
Nope, we did it with git clone -o frotz git://frotz.foo.bar/toplevel.git
We *never* define origin, frotz.foo.bar is *always* frotz.

 I would not object if you also gave another nickname
'frotz' to the same repository for consistency across
developers.
good. We are making (some) progress. :^)
If that is the case, I am wondering why your subprojects are not
pointing at the corresponding repository on that same
'frotz.foo.bar' machine as 'origin'.  I suspect the reason is
that .gitmodules do not say 'frotz.foo.bar' but name some other
machine.
Actually,
1) We don't use origin because we want to avoid having to wonder "Is frotz.foo.bar named 'origin' or 'frotz' on this client, and thus how do I get data from frotz?"
2) I submitted the change allowing submodules to be recorded into .gitmodules with a relative url (e.g., ./path_from_parent_to_submodule) rather than an absolute one, so we record the relative path only.
3) Thus, git submodule has set up the submodules to point at the parent project's default remote. However, in the parent the server is nicknamed "frotz", but now in the submodule the server is nicknamed "origin". Oops.

With my patches, parent and submodule both refer to frotz.foo.bar as frotz.
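
To illustrate how a relative url recorded in .gitmodules follows the parent to whatever server it was cloned from, here is a minimal sketch of the dereferencing step. The function below is a simplified stand-in written for this example, not the code in git-submodule; the submodule name sub.git and the host mirror.example.org are hypothetical.

```shell
# Simplified sketch: resolve a relative submodule url against the url of
# the parent's default remote.
#   $1 = url of the parent's remote
#   $2 = relative url as recorded in .gitmodules (./x or ../x)
resolve_relative_url() {
    base=${1%/}
    rel=$2
    while :; do
        case $rel in
        ../*)
            # Go up one level: drop the last path component of the base.
            rel=${rel#../}
            base=${base%/*}
            ;;
        ./*)
            # Same level: just strip the leading "./".
            rel=${rel#./}
            ;;
        *)
            break
            ;;
        esac
    done
    printf '%s\n' "$base/${rel%/}"
}

# The same .gitmodules entry yields a different url per hosting server.
resolve_relative_url git://frotz.foo.bar/myproject.git ./sub.git
resolve_relative_url git://mirror.example.org/myproject.git ../sub.git
```

Because only the relative part is stored in-tree, rehosting the whole project changes the base url and nothing else.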

And in-tree .gitmodules can name only one URL, as it is project
global and shared by everybody.  There is no escaping it.
At least as things were designed, "git submodule init" takes URL
recorded in .gitmodules as a hint, but this is for the user to
override in .git/config in the top-level.  Maybe the UI to allow
this overriding is not easy enough to use, and your submodules
ended up pointing at wrong (from the machine's point of view)
URL as 'origin'.  And perhaps that is the root cause of this
issue?

Again, the relative-url patch was to address this so that a project that is mirrored to another server remains valid on the new server without modifying the .gitmodules in-tree. (Yes, I know you *can* modify information in a given clone's .git/config, but I'm trying to avoid such manual per clone/checkout modifications where it can reasonably be done.)

Basically, I think an important (but not complete) test of the design is that

   git clone -o frotz git://frotz.foo.bar/myproject.git
   cd myproject
   git submodule init
   git submodule update

work, with origin = frotz throughout the submodules, and with the whole project correctly checked out even if the entire project was rehosted onto a different server. With relative urls and my latest patch series last night, this all works, and of course upstream can still be "origin" if that is what is desired.

While our overall project exists on many servers, mirroring is an incorrect term. Rather, only certain branches of various parts exist everywhere, many other branches are specific to a given server, so we really name branches using servername/branchname. It is this aspect of the project that causes us to be aware of the server in use, and thus makes use of "origin" as a generic upstream not useful.

I am looking at the discussion on the list archive when we
discussed the initial design of .gitmodules:

    http://thread.gmane.org/gmane.comp.version-control.git/47466/focus=47502
    http://thread.gmane.org/gmane.comp.version-control.git/47466/focus=47548
    http://thread.gmane.org/gmane.comp.version-control.git/47466/focus=47621

I do not think we are there yet, and suspect that the current
"git submodule init" does not give the user a chance to say "the
URL recorded in the in-tree .gitmodules corresponds to this URL
in this repository for administrative or network connectivity or
whatever reasons".
Maybe that is the real issue that we should be tackling. I
dunno.

Although I _think_ being able to use nickname other than
hardcoded 'origin' for fetch/merge is a good change, if my above
suspicion is correct, that change alone would not make the life
easier to people who _use_ submodules, as the need for them to
set up extra nicknames (like 'frotz') and configure the
submodule repositories to use that specific nickname instead of
'origin' would not change.


git-submodule right now supports two different layouts (urls relative to the parent, and absolute urls such that each sub-module is on an independent server). The management approaches to these are going to be different.

I also suspect there are two basic use cases here: accumulation of a number of independently managed projects vs. splitting a single major project into a number of smaller pieces to allow some decoupling, but still managing the set as a composite whole.

There may be some direct correlation of use-case and submodule layout, don't know. My project uses relative-urls, and I am managing a large project that has been split into a number of components. So, my suggestions are focused entirely upon this design and use-case, and I don't expect I am addressing the others at all. (As usual, this requires someone who needs the other model(s) to step up and drive).

For *my* uses (relative urls, single logical project):

1) There are times when the parent's branch.<name>.remote should be flowed down to all subprojects for git submodule update; of course, this would require that the remote be defined for all.
2) Thus, there needs to be a way to define a new remote globally for the project, and have it be correctly interpreted by each submodule (e.g., a repeat of the relative-url dereferencing now done by submodule init, but applied later to all submodules to define a new remote). Yes, this could be accomplished by going into each submodule independently and issuing appropriate commands, but administration would be much easier given a top-level command that could recurse and "do the right thing" per sub-project.
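
A top-level command of this kind does not exist; purely as a sketch of what it might do, the function below reads "path relative-url" pairs (as they would appear in .gitmodules) and prints, without running them, the git remote add commands that would define a new nickname in every submodule. The function name, the input format, the nickname zork, and the host zork.foo.bar are all assumptions made for this example.

```shell
# Hypothetical dry-run planner: for each submodule, print the command
# that would define remote NICK relative to the parent's new url.
#   $1    = nickname of the new remote
#   $2    = url of that remote for the parent project
#   stdin = one "path relative-url" pair per line
plan_submodule_remotes() {
    nick=$1
    parent_url=${2%/}
    while read -r path rel; do
        url=$parent_url
        # Dereference ./ and ../ against the parent's url.
        while :; do
            case $rel in
            ../*) rel=${rel#../}; url=${url%/*} ;;
            ./*)  rel=${rel#./} ;;
            *)    break ;;
            esac
        done
        printf 'git -C %s remote add %s %s\n' "$path" "$nick" "$url/${rel%/}"
    done
}

printf '%s\n' 'lib ./lib.git' 'docs ../docs.git' |
    plan_submodule_remotes zork git://zork.foo.bar/myproject.git
```

Printing the commands instead of executing them keeps the sketch safe to try; a real implementation would also have to cope with remotes that already exist and with nested submodules.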

I *suspect* that origin is a much more useful concept for the alternate construct (absolute urls, loose alliance of separately managed projects), but as I said that is not my problem so please ask folks who have that model to define what works for them.
For communication purposes, I would agree with Dscho that the
name 'origin' that names different things for different people
is wrong and using specific name 'frotz' would solve
communication issues.  But when using the repository and doing
actual work, wouldn't it be _much_ better if you can
consistently go to a repository on a random machine and always
can say 'origin' to mean the other repository this repository
usually gets new objects from (and sends its new objects to)?


(Actually, I thought I was the one arguing that using origin when it means different things to different folks is not good. That's the root of my problems. :^) )

Anyway, I have not found any use of "origin" on my project really useful. We have to be and *are* aware of the server/branchname in use, not just the branch. Partly this is because different subgroups have different natural gathering points (we tend to exchange data via ad hoc "mob" branches on whatever server is most accessible to the particular group), and partly because some information simply cannot be allowed on some servers, but basically the more accessible a server is, the less information that server can have. I believe "origin" is really useful only when it has just one meaning, or when all values are effectively identical (e.g., you have several mirrors for load balancing, etc, but all are identical modulo mirroring delays).

OTOH, a reasonable change to the semantics of "origin" might be to have:
1) core.origin name the remote that is the "normal" upstream.
2) Reserve and allow use of the name "origin" to mean $core.origin, e.g., in shell scripts replace all references to remote "origin" with $(git config core.origin). Of course, if core.origin = origin, then no user visible change occurs.

In this way, git would not record the same remote's branches in two ways (as origin/master and as frotz/master), but rather dereference origin -> frotz and then get frotz/master. Dunno, no matter how you slice it, having more than one way to refer to the same remote is going to be confusing, and that's why we don't use origin.
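
The dereferencing idea can be sketched as a small name-mapping step. This is only an illustration of the suggested semantics, not existing git behavior; the helper name deref_tracking_ref is invented, and the value of core.origin is passed in as an argument so the example does not depend on any repository.

```shell
# Hypothetical helper: treat "origin" as an alias for the remote named
# by core.origin, so origin/master and frotz/master name the same ref.
#   $1 = remote name or remote/branch as typed by the user
#   $2 = value of core.origin (empty if unset, which means "origin")
deref_tracking_ref() {
    ref=$1
    core_origin=${2:-origin}
    case $ref in
    origin/*)
        # Rewrite only the remote part, keep the branch name.
        printf '%s/%s\n' "$core_origin" "${ref#origin/}"
        ;;
    origin)
        printf '%s\n' "$core_origin"
        ;;
    *)
        # Any other nickname is left untouched.
        printf '%s\n' "$ref"
        ;;
    esac
}

deref_tracking_ref origin/master frotz
deref_tracking_ref zork/master frotz
```

With core.origin unset or set to origin, the mapping is the identity, which matches the "no user visible change" case above.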


Mark