Re: [PATCH] t4202 (log): add failing test for log with subtree

Junio C Hamano wrote:
> It sounds like you are repeating the same old "let's record renames
> in the commit", and in a system (not Git) where recording renames
> may make sense, you may be making sense.
>
> But we will not record renames in the commit.  Time to re-read
> $gmane/217, perhaps?

Yeah, you're right.  It makes no sense to record renames, although I
have a different argument against it: any implementation that records
renames will depend on the path that was taken to get to the final
state, and this is completely wrong.  Subversion made this mistake,
and users pay a very heavy price: if the user didn't explicitly rename
a node (~= tree) and just did a delete + add, the repository is
more-or-less screwed wrt merging.

Forget everything that I said, and let's start over.

In the common use case of subtrees, it might make sense to record some
additional submodule-like parameters in the subtree's tree object.
This additional information is entirely optional, and doesn't change
the way merges happen: we can have it as merely a heuristic-helper
(for git merge -s subtree).  Initially, let's just think of putting a
ref field in the tree, so that I can have the following setup:

- remote origin referring to my superproject.
- remote git/origin referring to the fetchdefault of my subproject git.
- remote git/ram referring to the pushdefault of my subproject.
- the tree object corresponding to quux/baz/git is additionally filled
in with the ref refs/remotes/git/origin/master.

Then, I can just say git pull quux/baz/git, and it will automatically
fetch changes from the ref and merge them into the subtree.  That's not
to say that I can't merge any other ref; just that I merge this ref most
of the time, and I want a DWIM for this case.
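
To make the DWIM concrete, here is a rough Python sketch of how a
path argument could expand into a fetch plus a subtree merge.  Nothing
like this exists in git today: the aux_refs mapping and the pull_plan()
helper are hypothetical, and splitting the ref at its last slash assumes
a branch name without slashes.

```python
from typing import Dict, List, Optional

# Hypothetical stand-in for the ref field stored in the subtree's tree object.
AUX_REFS: Dict[str, str] = {
    "quux/baz/git": "refs/remotes/git/origin/master",
}


def pull_plan(path: str, aux_refs: Dict[str, str]) -> Optional[List[List[str]]]:
    """Return the commands 'git pull <path>' might expand to, or None."""
    ref = aux_refs.get(path.rstrip("/"))
    prefix = "refs/remotes/"
    if ref is None or not ref.startswith(prefix):
        return None  # no aux data: nothing to DWIM
    # Assumption: the last component is the branch, the rest is the remote.
    remote, _, branch = ref[len(prefix):].rpartition("/")
    if not remote or not branch:
        return None
    return [
        ["git", "fetch", remote, branch],
        ["git", "merge", "-s", "subtree", ref],
    ]
```

Merging any other ref by hand stays possible; the plan only covers the
default case.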

Further, this can speed up tree-rename detection greatly (in fact, I'm
thinking the first implementation will depend on this information
being present).  I inspect M^{tree} and I want to know how it's
composed from M^1^{tree} and M^2^{tree}.  Simple.  In M^{tree}, look
for trees that have this additional data filled in: then we can just
short-circuit the rename detection to matching the similarity of this
tree with M^1^{tree} and M^2^{tree}.
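
A toy version of that short-circuit, in Python.  Trees are modelled as
flat dicts from path to blob id, and the similarity measure (identical
entries over the union of paths) is my own simplification, not git's
rename-detection scoring.  All function names are made up for
illustration.

```python
from typing import Dict, List

Tree = Dict[str, str]  # path -> blob id (flattened tree, toy model)


def subtree_entries(tree: Tree, prefix: str) -> Tree:
    """Entries under 'prefix', with the prefix stripped from each path."""
    lead = prefix.rstrip("/") + "/"
    return {p[len(lead):]: oid for p, oid in tree.items() if p.startswith(lead)}


def similarity(a: Tree, b: Tree) -> float:
    """Fraction of paths (over the union) that carry the same blob id."""
    paths = set(a) | set(b)
    if not paths:
        return 1.0
    same = sum(1 for p, oid in a.items() if b.get(p) == oid)
    return same / len(paths)


def likely_parent(merge_tree: Tree, prefix: str, parents: List[Tree]) -> int:
    """1-based parent number whose root tree best matches the marked subtree."""
    sub = subtree_entries(merge_tree, prefix)
    scores = [similarity(sub, parent) for parent in parents]
    return scores.index(max(scores)) + 1
```

With the aux data telling us which subtree to look at, only one
candidate prefix needs scoring against each parent, instead of every
tree in M^{tree}.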

When this aux data is present in the tree, we can do one more thing:
have a symref tracking the commit-line corresponding to M^2. This
means that we can DWIM for things like 'git commit' inside the subtree
very easily.  When the aux information is not present, 'git commit'
will obviously commit to HEAD, essentially making the superproject own
those changes in the subtree (as it does now).
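
Roughly, the choice of what 'git commit' updates could look like the
Python sketch below.  The symref names and the commit_target() helper
are invented for illustration; the only behaviour taken from the above
is "innermost aux subtree wins, otherwise HEAD".

```python
from typing import Dict


def commit_target(path: str, aux_symrefs: Dict[str, str]) -> str:
    """Pick the ref a commit made at 'path' should advance."""
    parts = [p for p in path.strip("/").split("/") if p]
    # Walk from the deepest directory upwards so nested subtrees win.
    for end in range(len(parts), 0, -1):
        candidate = "/".join(parts[:end])
        if candidate in aux_symrefs:
            return aux_symrefs[candidate]
    return "HEAD"  # no aux data: the superproject owns the change


# Hypothetical symref names, one per aux-tree.
AUX_SYMREFS = {
    "quux/baz/git": "refs/aux/git",
    "quux/baz/git/moo/clayoven": "refs/aux/clayoven",
}
```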

This might be the route to implementing narrow clones sensibly.  A
narrow clone does not mean that any directory in the entire repository
can be cloned separately: it just means that a tree with this aux data
need not be fetched in the initial clone.  For this to work, instead
of refs/remotes/git/origin/master, the aux data will need a
combination of upstream URL and ref: we can then automatically figure
out the name from the URL and deposit the fetched data in
refs/remotes/<name>/origin/<ref>.
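
For instance, the name could be derived the way a clone derives its
directory name: last path component, minus a trailing ".git".  That rule
is an assumption on my part, as are the helper names and the example.com
URL in this Python sketch.

```python
def name_from_url(url: str) -> str:
    """Last path component of the URL, without a trailing '.git'."""
    # Treat ':' like '/' so scp-style 'host:path/repo.git' also works.
    tail = url.rstrip("/").replace(":", "/").rsplit("/", 1)[-1]
    if tail.endswith(".git"):
        tail = tail[: -len(".git")]
    return tail


def tracking_ref(url: str, ref: str) -> str:
    """Where fetched data would land: refs/remotes/<name>/origin/<ref>."""
    heads = "refs/heads/"
    branch = ref[len(heads):] if ref.startswith(heads) else ref
    return f"refs/remotes/{name_from_url(url)}/origin/{branch}"
```

So an aux entry of (https://example.com/scm/git.git, master) would land
in refs/remotes/git/origin/master, matching the setup above.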

Initializing nested submodules without the container is also easy: in
the superproject, you need to have aux-trees corresponding to
quux/baz/git and quux/baz/git/moo/clayoven.  It might additionally be
a good idea to track these aux-trees: but even if we don't go down
that route, we can always deposit a "template" file from the
superproject that won't interfere with the subtree merging process.

Now, I'm not sure what the value of splitting the object stores at
this point is.  The aux-tree can have a statthrough field to block
stat() calls from going through, so there's really no performance
issue.  If you want to separate one tree out of the superproject and
work on it separately, all you have to do is fetch the corresponding
ref.

On the issue of floating submodules: it might make sense to zero out
the hex of the tree, as seen by the superproject.  The limitation is
that we can't introduce any changes to the submodule from the
superproject: it's basically a dummy entry sitting in the
superproject's tree.  It's a bit of a hack, but I think it's workable.
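
In toy Python terms, with tree entries as a dict from path to hex id,
the dummy entry and the resulting limitation might be checked like this.
The 40-character width assumes SHA-1, and both helpers are hypothetical.

```python
from typing import Dict, List

NULL_OID = "0" * 40  # all-zero hex, assuming 40-character SHA-1 ids


def is_floating(oid: str) -> bool:
    """A zeroed-out entry marks a floating submodule."""
    return oid == NULL_OID


def rejected_changes(staged: List[str], entries: Dict[str, str]) -> List[str]:
    """Staged paths that sit under a floating entry and so must be refused."""
    floating = [p for p, oid in entries.items() if is_floating(oid)]
    return [
        s for s in staged
        if any(s == f or s.startswith(f + "/") for f in floating)
    ]
```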

So, what do you think?