Re: Bug? git submodule update --reference doesn't use the referenced repository


On 04-Apr-17 4:32 AM, Stefan Beller wrote:
On Sun, Apr 2, 2017 at 8:13 PM, Maxime Viargues
<maxime.viargues@xxxxxxxxxx> wrote:
Hi there,

I have been trying to use the --reference option to clone a big repository
using a local copy, but I can't manage to make it work using sub-module
update. I believe this is a bug, unless I missed something.
I am on Windows, Git 2.12.0
which is new enough that the new --reference code is in. :)

So the problem is as follows:
- I have got a repository with multiple sub-modules, say
     main
         lib1
             sub-module1.git
         lib2
             sub-module2.git
- The original repositories are on GitHub, which makes cloning slow
- I have done a normal git clone of the entire repository (not bare) and put
it on a file server, say \\fileserver\ref-repo\
(Note that the problem also happens with a local copy)

So if I do a clone to get the repo and all the submodules with...
git clone --reference-if-able \\fileserver\ref-repo --recursive
git@xxxxxxxxxx:company/main
...then it all works, all the sub-modules get cloned and it's fast.
great. :)

Now in my case I am working with Jenkins jobs and I need to first do a
clone, and then get the sub-modules, but if I do...
git clone --reference-if-able \\fileserver\ref-repo
git@xxxxxxxxxx:company/main (so non-recursive)
cd main
git submodule update --init --reference \\fileserver\ref-repo
... then this takes ages, as it would normally do without the use of
--reference. I suspect it's not actually using it.
So to confirm your suspicion, can you run

   GIT_TRACE=1 git clone ...
   cd main && GIT_TRACE=1 git submodule update ...

to see which child processes are spawned to deal with the submodules?
Also to confirm, it is the "submodule update" that is taking so long for you?
Yes, I confirm it's the "submodule update" that is taking a long time. The clone with the reference is definitely working.

Running git submodule update with "GIT_TRACE=1", here is a snippet of what I get:

10:14:44.924684 git.c:596 trace: exec: 'git-submodule' 'update' '--init' '--reference' '\\fileserver\Builds\reference_repos\main-repo'
10:14:44.925684 run-command.c:369 trace: run_command: 'git-submodule' 'update' '--init' '--reference' '\\fileserver\Builds\reference_repos\main-repo'
10:14:45.146488 git.c:596 trace: exec: 'git-sh-i18n--envsubst' '--variables' 'usage: $dashless $USAGE'
10:14:45.146488 run-command.c:369 trace: run_command: 'git-sh-i18n--envsubst' '--variables' 'usage: $dashless $USAGE'
10:14:45.231548 git.c:596 trace: exec: 'git-sh-i18n--envsubst' 'usage: $dashless $USAGE'
10:14:45.231548 run-command.c:369 trace: run_command: 'git-sh-i18n--envsubst' 'usage: $dashless $USAGE'
10:14:45.357059 git.c:371 trace: built-in: git 'rev-parse' '--git-dir'
10:14:45.427806 git.c:371 trace: built-in: git 'rev-parse' '--git-path' 'objects'
10:14:45.487348 git.c:371 trace: built-in: git 'rev-parse' '-q' '--git-dir'
10:14:45.593794 git.c:371 trace: built-in: git 'rev-parse' '--show-prefix'
10:14:45.643162 git.c:371 trace: built-in: git 'rev-parse' '--show-toplevel'
10:14:45.700201 git.c:371 trace: built-in: git 'submodule--helper' 'init'
10:14:45.986024 git.c:371 trace: built-in: git 'submodule--helper' 'update-clone' '--reference=\\fileserver\Builds\reference_repos\main-repo'
10:14:45.988024 run-command.c:1155 run_processes_parallel: preparing to run up to 1 tasks
10:14:45.988024 run-command.c:369 trace: run_command: 'submodule--helper' 'clone' '--path' 'lib1/lib1_source' '--name' 'lib1/lib1_source' '--url' 'git@xxxxxxxxxx:company/main-repo-lib1.git' '--reference' '\\fileserver\Builds\reference_repos\main-repo'
10:15:06.204872 run-command.c:369 trace: run_command: 'submodule--helper' 'clone' '--path' 'lib2/lib2_source' '--name' 'lib2/lib2_source' '--url' 'git@xxxxxxxxxx:company/main-repo-lib2.git' '--reference' '\\fileserver\Builds\reference_repos\main-repo'
10:14:46.025555 git.c:371 trace: built-in: git 'submodule--helper' 'clone' '--path' 'lib1/lib1_source' '--name' 'lib1/lib1_source' '--url' 'git@xxxxxxxxxx:company/main-repo-lib1.git' '--reference' '\\fileserver\Builds\reference_repos\main-repo'
10:14:46.027555 run-command.c:369 trace: run_command: 'clone' '--no-checkout' '--reference' '\\fileserver\Builds\reference_repos\main-repo' '--separate-git-dir' 'D:/tmp2/git_clone_tests/main-repo/.git/modules/lib1/lib1_source' 'git@xxxxxxxxxx:company/main-repo-lib1.git' 'D:/tmp2/git_clone_tests/main-repo/lib1/lib1_source'
10:14:46.061305 git.c:371 trace: built-in: git 'clone' '--no-checkout' '--reference' '\\fileserver\Builds\reference_repos\main-repo' '--separate-git-dir' 'D:/tmp2/git_clone_tests/main-repo/.git/modules/lib1/lib1_source' 'git@xxxxxxxxxx:company/main-repo-lib1.git' 'D:/tmp2/git_clone_tests/main-repo/lib1/lib1_source'
10:14:46.115339 run-command.c:369 trace: run_command: 'ssh' 'git@xxxxxxxxxx' 'git-upload-pack '\''company/main-repo-lib1.git'\'''
Cloning into 'D:/tmp2/git_clone_tests/main-repo/lib1/lib1_source'...
10:14:48.962590 run-command.c:369 trace: run_command: 'git-upload-pack '\''//fileserver/Builds/reference_repos/main-repo/.git'\'''
10:14:49.103908 run-command.c:369 trace: run_command: 'git-upload-pack '\''D:/GitHub/main-repo/.git'\'''
10:14:49.184477 run-command.c:369 trace: run_command: 'git-upload-pack '\''//fileserver/Builds/reference_repos/main-repo/.git'\'''
10:14:49.322365 run-command.c:369 trace: run_command: 'git-upload-pack '\''D:/GitHub/main-repo/.git'\'''
10:14:52.281044 run-command.c:369 trace: run_command: 'index-pack' '--stdin' '--fix-thin' '--keep=fetch-pack 5764 on WIN-1198' '--check-self-contained-and-connected'
10:14:52.315569 git.c:371 trace: built-in: git 'index-pack' '--stdin' '--fix-thin' '--keep=fetch-pack 5764 on WIN-1198' '--check-self-contained-and-connected'
10:15:06.119340 run-command.c:369 trace: run_command: 'rev-list' '--objects' '--stdin' '--not' '--all' '--quiet'
10:15:06.170876 git.c:371 trace: built-in: git 'rev-list' '--objects' '--stdin' '--not' '--all' '--quiet'
10:15:13.072336 run-command.c:369 trace: run_command: 'submodule--helper' 'clone' '--path' 'lib3/lib3_source' '--name' 'lib3/lib3_source' '--url' 'git@xxxxxxxxxx:company/main-repo-lib3' '--reference' '\\fileserver\Builds\reference_repos\main-repo'
10:15:06.238893 git.c:371 trace: built-in: git 'submodule--helper' 'clone' '--path' 'lib2/lib2_source' '--name' 'lib2/lib2_source' '--url' 'git@xxxxxxxxxx:company/main-repo-lib2.git' '--reference' '\\fileserver\Builds\reference_repos\main-repo'
10:15:06.239894 run-command.c:369 trace: run_command: 'clone' '--no-checkout' '--reference' '\\fileserver\Builds\reference_repos\main-repo' '--separate-git-dir' 'D:/tmp2/git_clone_tests/main-repo/.git/modules/lib2/lib2_source' 'git@xxxxxxxxxx:company/main-repo-lib2.git' 'D:/tmp2/git_clone_tests/main-repo/lib2/lib2_source'
10:15:06.273418 git.c:371 trace: built-in: git 'clone' '--no-checkout' '--reference' '\\fileserver\Builds\reference_repos\main-repo' '--separate-git-dir' 'D:/tmp2/git_clone_tests/main-repo/.git/modules/lib2/lib2_source' 'git@xxxxxxxxxx:company/main-repo-lib2.git' 'D:/tmp2/git_clone_tests/main-repo/lib2/lib2_source'
10:15:06.309945 run-command.c:369 trace: run_command: 'ssh' 'git@xxxxxxxxxx' 'git-upload-pack '\''company/main-repo-lib2.git'\'''
Cloning into 'D:/tmp2/git_clone_tests/main-repo/lib2/lib2_source'...
10:15:08.210491 run-command.c:369 trace: run_command: 'git-upload-pack '\''//fileserver/Builds/reference_repos/main-repo/.git'\'''
10:15:08.370561 run-command.c:369 trace: run_command: 'git-upload-pack '\''D:/GitHub/main-repo/.git'\'''
10:15:08.451234 run-command.c:369 trace: run_command: 'git-upload-pack '\''//fileserver/Builds/reference_repos/main-repo/.git'\'''
10:15:08.589129 run-command.c:369 trace: run_command: 'git-upload-pack '\''D:/GitHub/main-repo/.git'\'''
10:15:11.533328 run-command.c:369 trace: run_command: 'index-pack' '--stdin' '--fix-thin' '--keep=fetch-pack 9308 on WIN-1198' '--check-self-contained-and-connected'
10:15:11.575862 git.c:371 trace: built-in: git 'index-pack' '--stdin' '--fix-thin' '--keep=fetch-pack 9308 on WIN-1198' '--check-self-contained-and-connected'
10:15:12.986776 run-command.c:369 trace: run_command: 'rev-list' '--objects' '--stdin' '--not' '--all' '--quiet'
10:15:13.039314 git.c:371 trace: built-in: git 'rev-list' '--objects' '--stdin' '--not' '--all' '--quiet'
10:15:49.633796 run-command.c:369 trace: run_command: 'submodule--helper' 'clone' '--path' 'lib4/lib4_source' '--name' 'lib4/lib4_source' '--url' 'git@xxxxxxxxxx:company/main-repo-lib4.git' '--reference' '\\fileserver\Builds\reference_repos\main-repo'
10:15:13.106441 git.c:371 trace: built-in: git 'submodule--helper' 'clone' '--path' 'lib3/lib3_source' '--name' 'lib3/lib3_source' '--url' 'git@xxxxxxxxxx:company/main-repo-lib3' '--reference' '\\fileserver\Builds\reference_repos\main-repo'
10:15:13.107441 run-command.c:369 trace: run_command: 'clone' '--no-checkout' '--reference' '\\fileserver\Builds\reference_repos\main-repo' '--separate-git-dir' 'D:/tmp2/git_clone_tests/main-repo/.git/modules/lib3/lib3_source' 'git@xxxxxxxxxx:company/main-repo-lib3' 'D:/tmp2/git_clone_tests/main-repo/lib3/lib3_source'
10:15:13.141464 git.c:371 trace: built-in: git 'clone' '--no-checkout' '--reference' '\\fileserver\Builds\reference_repos\main-repo' '--separate-git-dir' 'D:/tmp2/git_clone_tests/main-repo/.git/modules/lib3/lib3_source' 'git@xxxxxxxxxx:company/main-repo-lib3' 'D:/tmp2/git_clone_tests/main-repo/lib3/lib3_source'
10:15:13.174486 run-command.c:369 trace: run_command: 'ssh' 'git@xxxxxxxxxx' 'git-upload-pack '\''company/main-repo-lib3'\'''
...
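
In a trace this long, the lines that matter for the --reference question are the inner 'clone' invocations. A minimal sketch for picking them out follows; it assumes a POSIX shell (for example Git Bash on Windows) and the quoted trace format shown above, which may differ in other Git versions. The function name clone_commands_from_trace is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read GIT_TRACE output on stdin and keep only the
# lines where git spawns the inner 'clone' command for a submodule.
# The fixed string matches the quoted trace format shown in this thread.
clone_commands_from_trace() {
    grep -F "trace: run_command: 'clone'"
}

# Example usage (commented out because it needs a real superproject clone;
# GIT_TRACE writes to stderr, hence the 2>&1):
#   GIT_TRACE=1 git submodule update --init --reference <path> 2>&1 |
#       clone_commands_from_trace
```

Each surviving line shows exactly which --reference value was handed to the clone of one submodule.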

The git clone documentation mentions that the reference is then passed to
the sub-module clone commands, so I would expect "git clone --recursive" to
work the same as "git submodule update", as far as --reference is concerned.
Oh, there we have an opportunity to improve the man page (or the code).

     git clone --reference --recursive ...

will set the config variables

     git config submodule.alternateLocation superproject
     git config submodule.alternateErrorStrategy die (or "info" for --reference-if-able)

and the clone for the submodules (which is an independent process, just
run after the clone of the superproject is done) will pick up these
config variables and act accordingly.

If you only run

     git clone --reference ...

then these variables are not set. Probably they should be set such
that the later invocation of "git submodule update --init" will behave
the same as the git-clone of the superproject did.

So as a workaround for you to get up to speed again, you can just set
these config variables yourself before running the "submodule update --init"
and it should work.
Ok I'll try that.
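
The workaround described above could look roughly like the sketch below. It only uses the two config variables named in this thread; the function name enable_submodule_alternates is hypothetical, and the sketch assumes the superproject was already cloned with --reference (or --reference-if-able) against a reference repository that contains its submodules.

```shell
#!/bin/sh
# Hypothetical helper: set, in an existing superproject clone, the two
# config variables that "git clone --reference --recursive" would have set.
#   $1 = path to the superproject working tree
#   $2 = error strategy: "die" (like --reference) or
#        "info" (like --reference-if-able); defaults to "info"
enable_submodule_alternates() {
    repo_dir=$1
    strategy=${2:-info}
    git -C "$repo_dir" config submodule.alternateLocation superproject &&
    git -C "$repo_dir" config submodule.alternateErrorStrategy "$strategy"
}

# Example usage (commented out because it needs a real superproject clone):
#   enable_submodule_alternates main info
#   git -C main submodule update --init
```

With the variables in place, the later "submodule update --init" is run without any --reference argument of its own.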
I noticed for a single module, doing a...
git submodule update --init --reference
\\fileserver\ref-repo\lib1\sub-module1 -- lib1/sub-module1
...i.e. adding the sub-module path to the reference path, works. Which kind
of makes sense, but then how do you apply it to all the sub-modules?
(without writing a script to do that)
I think that functionality is broken as it takes the same reference
for all submodules, such that you need to go through the submodules
one by one and give the submodule-specific reference location.
I actually made a script to run it on each submodule, which works but is still quite slow as it cannot be parallelized (git doesn't like multiple submodule updates running concurrently).
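
The script itself was not posted; a minimal sketch of such a per-submodule loop might look like this. Both function names are made up for illustration, and the sketch assumes that each submodule sits at the same relative path inside the reference clone and that submodule names and paths contain no spaces.

```shell
#!/bin/sh
# Hypothetical helper: print every submodule path recorded in the
# .gitmodules file of the superproject at $1, one path per line.
list_submodule_paths() {
    git config --file "$1/.gitmodules" --get-regexp '^submodule\..*\.path$' |
        cut -d' ' -f2-
}

# Hypothetical helper: update the submodules one at a time, giving each
# one its own reference location (<ref_root>/<submodule path>).
#   $1 = path to the superproject working tree
#   $2 = root of the reference clone
update_with_per_submodule_reference() {
    repo_dir=$1
    ref_root=$2
    list_submodule_paths "$repo_dir" | (
        while IFS= read -r path; do
            git -C "$repo_dir" submodule update --init \
                --reference "$ref_root/$path" -- "$path" || exit 1
        done
    )
}

# Example usage (commented out because it needs a real superproject clone):
#   update_with_per_submodule_reference main //fileserver/ref-repo
```

The loop is deliberately sequential, in line with the observation above that concurrent submodule updates do not work well.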

If someone can confirm the problem or explain to me what I am doing wrong, that
would be great.

Maxime
Stefan
Thanks for your quick answer

Maxime