Re: Avery Pennarun's git-subtree?

On Tue, Jul 27, 2010 at 4:25 PM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> Avery Pennarun <apenwarr@xxxxxxxxx> writes:
>> I agree completely.  The major failing of git-subtree is that it uses
>> tree->tree links instead of tree->commit links.
>>
>> This was necessary only because git fundamentally *mistreats* tree->commit
>> links: it refuses to push or fetch through them automatically.
>
> I do not think that is so "fundamental" as you seem to think.
>
> Isn't it just the matter of how the default UI of object transfer commands
> (like push and fetch) are set up?

Well, I call it fundamental because there's currently no way to get
the git UI to do otherwise.  It's not really just a "default."  To
depend on this changing would have prevented me from writing
git-subtree, which is why I didn't depend on it.  However, I agree
that it's fixable.

Note that the way git treats a checked-out submodule (as you describe
below) is also very fundamental to how this works.  git-subtree
wouldn't have the usability that it does if 'git checkout branchname'
didn't work perfectly with all the subtrees, which it currently does,
but which it wouldn't if I had relied on tree->commit links.
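
To make the tree->tree versus tree->commit distinction concrete, here
is a minimal Python sketch that classifies lines in the format printed
by 'git ls-tree'.  The mode values (040000 for a subtree, 160000 for a
commit link) are git's own; the function name and the sample object
ids are made up for illustration.

```python
def classify_entry(line):
    """Classify one line of `git ls-tree` output.

    Each line has the form "<mode> <type> <object>\t<path>".
    """
    meta, path = line.split("\t", 1)
    mode, obj_type, obj_id = meta.split(" ")
    if mode == "160000" and obj_type == "commit":
        kind = "tree->commit link (submodule style)"
    elif mode == "040000" and obj_type == "tree":
        kind = "tree->tree link (ordinary directory, git-subtree style)"
    else:
        kind = "file or symlink"
    return path, kind


# Placeholder object ids, not real hashes from any repository.
sample = [
    "040000 tree " + "a" * 40 + "\tlib/subproject",
    "160000 commit " + "b" * 40 + "\tmodules/mod1",
    "100644 blob " + "c" * 40 + "\tREADME",
]
for line in sample:
    print(classify_entry(line))
```

Because a git-subtree directory is just an ordinary 040000 entry,
checkout, push, and fetch treat it like any other directory.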

> Some "recursive" operations have been added to commands for which it makes
> sense (e.g. "clone --recursive") by people who cared enough.  Even though
> there are a few other commands that shouldn't ever learn the recursive
> mode (e.g. "commit --recursive -m $msg" would not make sense), there still
> are some commands where a similar "--recursive" option would make sense
> but haven't learned it (e.g. "push --recursive").

One problem with this line of reasoning is that "--recursive" is
always an option.  But if submodules are ever to be easy to use, I
think it should be the default (or settable as a default using git
config).  This would take us a *long* way towards usability (of
course, in addition to adding the missing features, as you mention).

Also, I haven't tried it, but I think 'git gc' will prune away objects
if the only reference to them is a 'commit' link from a tree.  This
would be undesirable too.
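
Here is a toy model of that concern in Python.  It is not git's actual
code, and I haven't checked it against a real 'git gc'; the object
names and the follow_commit_links switch are invented for the sketch.
It only shows what falls out of the reachable set when a traversal
declines to follow commit links found in trees.

```python
def reachable(objects, roots, follow_commit_links):
    """Return the set of object ids reachable from roots.

    objects maps id -> (type, [(child_type, child_id), ...]).
    """
    seen = set()
    stack = list(roots)
    while stack:
        oid = stack.pop()
        if oid in seen or oid not in objects:
            continue
        seen.add(oid)
        obj_type, children = objects[oid]
        for child_type, child_id in children:
            # The case in question: a tree entry that points at a commit.
            if (obj_type == "tree" and child_type == "commit"
                    and not follow_commit_links):
                continue
            stack.append(child_id)
    return seen


# Hypothetical object graph: a superproject whose tree links to a commit.
objects = {
    "super-commit": ("commit", [("tree", "super-tree")]),
    "super-tree": ("tree", [("blob", "readme"), ("commit", "sub-commit")]),
    "readme": ("blob", []),
    "sub-commit": ("commit", [("tree", "sub-tree")]),
    "sub-tree": ("tree", [("blob", "sub-file")]),
    "sub-file": ("blob", []),
}

kept = reachable(objects, ["super-commit"], follow_commit_links=False)
pruned = set(objects) - kept
print(sorted(pruned))
```

In this model, everything below the commit link ends up in the pruned
set unless some other ref keeps it alive.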

> I also consider it merely a lack of UI enhancement that you have to clone
> the submodule again (or cannot switch to a clean slate very easily) when
> switching between revisions of superproject before and after you add a
> submodule, and nothing fundamental.

I mostly agree with this.  There is one problem I don't know how to
solve with this idea, though: what happens when commit A adds a
submodule in modules/mod1, commit B removes it, and then commit C
re-adds the same submodule in modules/mod1-again?  Will it reuse the
same submodule .git directory or a new one?  Share objects or not?
Share branch names or not?  Share .git/config or not?

Unless you have some kind of "unique id" scheme for submodules, this
gets impossible to handle correctly.  And the git objects themselves
(trees that link to commits) have nowhere to put such things.
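
A small Python sketch of the A/B/C example, under the assumption that
a submodule's .git directory is stashed under .git/modules/ in the
superproject.  The history list, the stash_dirs helper, and the id
table are all hypothetical; the point is only that a key derived from
the path gives two different answers for what is meant to be the same
submodule.

```python
# Hypothetical history: (commit, action, submodule path).
history = [
    ("A", "add", "modules/mod1"),
    ("B", "remove", "modules/mod1"),
    ("C", "add", "modules/mod1-again"),
]


def stash_dirs(history, key):
    """Map each commit that adds a submodule to its stashed .git location."""
    return {
        commit: ".git/modules/" + key(path)
        for commit, action, path in history
        if action == "add"
    }


# Keyed by path: commits A and C disagree about where the repo lives.
by_path = stash_dirs(history, key=lambda path: path)

# Keyed by an explicit id: both agree, but this table has to be stored
# somewhere, and a tree->commit entry has no field to carry it.
ids = {"modules/mod1": "mod1", "modules/mod1-again": "mod1"}
by_id = stash_dirs(history, key=lambda path: ids[path])

print(by_path)
print(by_id)
```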

By comparison, simply putting all the stuff related to all the
submodules into the supermodule's repo creates none of these confusing
problems.  You could even still choose not to checkout individual
submodules' trees if you wanted.

> When switching back in history to lose a recent submodule, the user
> experience should be like switching to a revision that didn't have a
> directory.  You shouldn't be able to lose your change in that directory,
> but if the directory is clean, you should be able to lose it.  And when
> you switch to a more recent revision that has the submodule, you should be
> able to get it back (again, if you have a precious file there, the
> checkout should barf).

It sounds like you're proposing that we delete the entire submodule's
directory hierarchy when the submodule commit link goes away.  Note
that this isn't what happens in the non-submodule case: all the *.o
files, for example, in a deleted subdirectory are not automatically
deleted by git.  And I think this is the behaviour we should expect.

With that in mind, the situations where checkout barfs because of a
"precious" file should be the same as they are in normal git: it
should only be a problem if the files in question differ between the
originally-checked-out tree and the newly-checked-out tree.
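
As a rough sketch of that rule in Python (a simplification, not git's
real checkout logic; the function name and the dict-based trees are
assumptions made for the example):

```python
def checkout_conflicts(old_tree, new_tree, worktree):
    """Return paths where switching trees would clobber local content.

    Each argument maps path -> content.  A missing key means the path
    does not exist in that tree (or in the working directory).
    """
    conflicts = []
    for path in sorted(set(old_tree) | set(new_tree)):
        if old_tree.get(path) == new_tree.get(path):
            continue  # same in both trees, so checkout leaves it alone
        if worktree.get(path) != old_tree.get(path):
            conflicts.append(path)  # local content differs from old tree
    return conflicts


old_tree = {"sub/main.c": "v1"}
new_tree = {}  # the subdirectory is gone in the target revision

# A build product that is in neither tree is never examined.
clean = {"sub/main.c": "v1", "sub/main.o": "object code"}
dirty = {"sub/main.c": "v1 plus local edit", "sub/main.o": "object code"}

print(checkout_conflicts(old_tree, new_tree, clean))
print(checkout_conflicts(old_tree, new_tree, dirty))
```

Only paths that appear in one of the two trees are ever compared, so
leftover files such as sub/main.o neither block the checkout nor get
deleted by it.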

Apologies if that's what you meant in the first place.

> We have added support for having "gitdir: $dir" in a regular file .git
> exactly because we wanted to be able to stash away the submodule's .git
> directory somewhere inside .git (e.g. .git/modules/<submodulename>) in the
> superproject when we do that kind of branch switching, so that we can get
> it back when switching back to a revision with the submodule without
> having to re-clone (also this presumably would help when you move the
> submodule in the superproject tree), but there haven't been further work
> to make use of this in "git submodule update" (it probably needs to start
> by teaching "git clone" how to make use of "gitdir: $dir", if anybody is
> interested).

I guess the real question is: just how much of a "real" repository do
we want a submodule to act like?

Thoughts:

- object store: I think this should just always be shared with the
superproject.  There's no reason to separate them that I can see.

- branches: there should be a way to simply not worry about branches and
just use what's in the superproject.  Other people seem to want to be
able to have a set of branches/tags for their submodule.

- .git/config: entirely shared?  entirely separate?

- remotes: I would want my submodules to never do their own
pushing/pulling, and leave that to the supermodule; other people seem
to disagree.

For the particular model I'm proposing, I'm just not sure that *any*
of the features of a separate repo are warranted... and having them
adds a lot of complication.  (At the most basic level, you suddenly
need to track .git directories as submodules are added/deleted/moved
around when you checkout different revisions of the superproject, and
there seems to be no way to do that elegantly.)

Have fun,

Avery