Re: [PATCH] submodule recursion in git-archive

On 26 Nov 2013, at 14:18, Junio C Hamano <gitster@xxxxxxxxx> wrote:

> René Scharfe <l.s.r@xxxxxx> writes:
> 
>> Thanks for the patches!  Please send only one per message (the second
>> one as a reply to the first one, or both as replies to a cover letter),
>> though -- that makes commenting on them much easier.
>> 
>> Side note: Documentation/SubmittingPatches doesn't mention that (yet),
>> AFAICS.
> 
> OK, how about doing this then?
> 
> Documentation/SubmittingPatches | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches
> index 7055576..304b3c0 100644
> --- a/Documentation/SubmittingPatches
> +++ b/Documentation/SubmittingPatches
> @@ -140,7 +140,12 @@ comment on the changes you are submitting.  It is important for
> a developer to be able to "quote" your changes, using standard
> e-mail tools, so that they may comment on specific portions of
> your code.  For this reason, all patches should be submitted
> -"inline".  If your log message (including your name on the
> +"inline".  A patch series that consists of N commits is sent as N
> +separate e-mail messages, or a cover letter message (see below) with
> +N separate e-mail messages, each being a response to the cover
> +letter.
> +
> +If your log message (including your name on the
> Signed-off-by line) is not writable in ASCII, make sure that
> you send off a message in the correct encoding.
> 
> 
>>> The feature is disabled for remote repositories as
>>> the git_work_tree fails. This is a possible future
>>> enhancement.
>> 
>> Hmm, curious.  Why does it fail?  I guess that happens with bare
>> repositories, only, right?  (Which are the most likely kind of remote
>> repos to encounter, of course.)
> 
> Yeah, I cannot think of a reason why it should fail in a bare
> repository, either. "git archive" is about writing out the contents
> of an already recorded tree, so there shouldn't be a reason to even
> call get_git_work_tree() in the first place.
> 
See below for why I use the .git file in the work tree to load the
objects for the submodule. I also expected it to work in a remote
repository, but when I ran it against a properly initialized remote
repository it failed. Since I didn’t need that for my immediate use case,
I decided to disable it with an error. I can look into this further, but
we need to settle the question below first.

> Even if the code is run inside a repository with a working tree,
> when producing a tarball out of an ancient commit that had a
> submodule not at its current location, --recurse-submodules option
> should do the right thing, so asking for working tree location of
> that submodule to find its repository is wrong, I think.  It may
> happen to find one if the archived revision is close enough to what
> is currently checked out, but that may not necessarily be the case.
> 
> At that point when the code discovers an S_ISGITLINK entry, it
> should have both a pathname to the submodule relative to the
> toplevel and the commit object name bound to that submodule
> location.  What it should do, when it does not find the repository
> at the given path (maybe because there is no working tree, or the
> submodule directory has moved over time) is roughly:
> 
> - Read from .gitmodules at the top-level from the tree it is
>   creating the tarball out of;
> 
> - Find "submodule.$name.path" entry that records that path to the
>   submodule; and then
> 
> - Using that $name, find the stashed-away location of the submodule
>   repository in $GIT_DIR/modules/$name.
> 
> or something like that.
> 
> This is a related tangent, but when used in a repository that people
> often use as their remote, the repository discovery may have to
> interact with the relative URL.  People often ship .gitmodules with
> 
> 	[submodule "bar"]
> 		URL = ../bar.git
> 		path = barDir
> 
> for a top-level project "foo" that can be cloned thusly:
> 
> 	git clone git://site.xz/foo.git
> 
> and host bar.git to be clonable with
> 
> 	git clone git://site.xz/bar.git barDir/
> 
> inside the working tree of the foo project.  In such a case, when
> "archive --recurse-submodules" is running, it would find the
> repository for the "bar" submodule at "../bar.git", I would think.
> 
> So this part needs a bit more thought, I am afraid.
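
To check my understanding of the relative URL case: as I read it,
"../bar.git" is resolved against the URL of the top-level project, so for
git://site.xz/foo.git it ends up at git://site.xz/bar.git. Here is a rough
Python sketch of that resolution, only to illustrate the idea. The function
name resolve_submodule_url is made up for this mail, and it ignores
scp-style URLs and local paths, which real code would also have to handle.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def resolve_submodule_url(superproject_url, submodule_url):
    """Resolve a relative submodule URL against the superproject URL.

    Hypothetical helper: only handles scheme://host/path style URLs.
    """
    if not submodule_url.startswith(("./", "../")):
        return submodule_url  # already absolute; nothing to resolve
    parts = urlsplit(superproject_url)
    # Treat the superproject path as a directory and walk relative to it.
    joined = posixpath.normpath(posixpath.join(parts.path, submodule_url))
    return urlunsplit((parts.scheme, parts.netloc, joined, "", ""))
```

With the values from your example, resolving "../bar.git" against
git://site.xz/foo.git gives git://site.xz/bar.git.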

I see that there is a lot of potential complexity around setting up a submodule:
* The .gitmodules file can be dirty (easy to flag, but should we allow archive to proceed?)
* Users can change settings both before git submodule init and before git submodule update.
* What if it’s a raw clone and the user manually changes things between init and update?
* I’m not a git-internals expert, but looking through the code I see that you can add
  additional object directories and change paths, as you show above.

For those reasons I deliberately decided not to reproduce the above logic myself.
What *did* seem clear to me is that once you have the .git file, all of that is
already covered, so I just used that. This restricts the feature to working only
on a properly set-up repository, but that is my use case!

If you think that doing this more extensive setup is even *viable* given the gap between
init and update, then I’m happy to try it. I didn’t want to start off on a fool’s errand.
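
Just so we are talking about the same thing, this is how I read the lookup
you outline (read .gitmodules from the tree being archived, map the gitlink
path to a submodule name, then look under $GIT_DIR/modules/$name). It is a
rough Python sketch, not what the patch does: the parser is deliberately
simplified, the helper names are invented, and the .gitmodules text is
assumed to come from something like "git show <rev>:.gitmodules".

```python
import os
import re


def parse_gitmodules(text):
    """Return {name: {key: value}} from .gitmodules-style text (simplified)."""
    modules = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        header = re.fullmatch(r'\[submodule\s+"(.+)"\]', line)
        if header:
            current = modules.setdefault(header.group(1), {})
        elif line.startswith("["):
            current = None  # some other section; ignore its keys
        elif current is not None and "=" in line:
            key, value = line.split("=", 1)
            # Variable names are case-insensitive, so normalise them.
            current[key.strip().lower()] = value.strip()
    return modules


def submodule_git_dir(gitmodules_text, path, git_dir):
    """Map a gitlink path to <git_dir>/modules/<name>, or None if unknown."""
    for name, settings in parse_gitmodules(gitmodules_text).items():
        if settings.get("path") == path:
            return os.path.join(git_dir, "modules", name)
    return None
```

For the .gitmodules example above, the path "barDir" maps to the name "bar",
so with a git_dir of /repo/.git the candidate location is
/repo/.git/modules/bar. When the path is not listed at all the function
returns None, which is where an error would have to be reported.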

> 
>>> 'git archive' [--format=<fmt>] [--list] [--prefix=<prefix>/] [<extra>]
>>> 	      [-o <file> | --output=<file>] [--worktree-attributes]
>>> +	      [--recursive|--recurse-submodules]
>> 
>> I'd expect git archive --recurse to add subdirectories and their
>> contents, which it does right now, and --no-recurse to only archive the
>> specified objects, which is not implemented.  IAW: I wouldn't normally
>> associate an option with that name with submodules.  Would
>> --recurse-submodules alone suffice?
> 
> Jens already commented on this, and I agree that --recursive should
> be dropped from this patch.

I only put --recursive because that is what git-clone uses for its behaviour with respect to submodules.
If that flag is deprecated then I’m fine with using only --recurse-submodules.
Perhaps a deprecation notice or a note in the code would help?


Overall I’m impressed by the speed and quality of the responses (and the codebase!), so I am glad to
move this forward. I look forward to your feedback.

Kind Regards
Nick
