Clone fails on a repo with too many heads/tags

I recently tried cloning a fresh copy of a large repo (converted from CVS, 
nearly 10 years of history) and to my surprise "git clone" failed with the 
following message:

    error: cannot spawn git: No such file or directory

The problem reproduces only with the Smart HTTP transport.

I used msysGit on Windows so my first instinct was to contact them, but after 
some poking around I discovered that the problem is present in the Linux 
version too, although harder to trigger.

Try executing this script:

-------------------------------
git init too-many-refs
cd too-many-refs
echo bla > bla.txt
git add .
git commit -m test
sha=$(git rev-parse HEAD)
for ((i=0; i<100000; i++)); do
	echo $sha refs/tags/artificially-long-tag-name-to-more-easily-demonstrate-the-problem-$i >> .git/packed-refs
done
-------------------------------

Now share this repo using the Smart HTTP transport (git-http-backend) and then 
try cloning it in a different directory. This is what you would get:

$ git clone http://localhost/.../too-many-refs/.git
Cloning into 'too-many-refs'...
fatal: cannot exec 'fetch-pack': Argument list too long

So we come to the real reason for the failure: somewhere inside Git a 
subcommand is invoked with all the tags/heads on the command line and if you 
have enough of them it overflows the command line length limit of the OS.

The number of tags in the "too-many-refs" repo above is absurd (100k) only 
because the command line length limit on Linux is much more generous; on 
Windows the clone fails with as few as 500 tags in the above loop! I am already 
hitting this problem with msysGit on real repos, not just artificial test cases.
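
To get a rough idea of how close a given repo is to the limit, the size of the 
ref list can be compared with what the OS allows. The sketch below is only an 
estimate: refs_argv_bytes is a hypothetical helper (not part of Git), and it 
assumes ASCII ref names and counts one extra byte per ref for the argument 
terminator. On POSIX systems "getconf ARG_MAX" reports the combined limit for 
arguments and environment; on Windows the documented CreateProcess limit is 
32,767 characters for the whole command line.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read ref names (one per line) on stdin and print the
# number of bytes they would occupy as separate command line arguments.
# Assumes ASCII names; each ref is counted with one terminator byte.
refs_argv_bytes() {
    local total=0 ref
    while IFS= read -r ref; do
        total=$(( total + ${#ref} + 1 ))
    done
    echo "$total"
}

# Example use, run inside the repository being served:
#   git for-each-ref --format='%(refname)' | refs_argv_bytes
#   getconf ARG_MAX
```

If the first number approaches or exceeds the second, a clone over Smart HTTP 
is likely to hit the failure described above.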

I tracked down the problem to remote-curl.c:fetch_git(). That's where the 
"fetch-pack" command line is being constructed with all the refs on one line:

git fetch-pack --stateless-rpc --lock-pack ...<all the refs>...

The solution is conceptually simple: if the list of refs would make the command 
line too long, split the refs into batches and call fetch-pack multiple times so 
that each call stays under the cmdline limit:

git fetch-pack --stateless-rpc --lock-pack ...<first batch of refs>...
git fetch-pack --stateless-rpc --lock-pack ...<second batch of refs>...
...
git fetch-pack --stateless-rpc --lock-pack ...<last batch of refs>...
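
The batching step itself could look roughly like the following sketch. It is 
written in shell purely for illustration (the real change would have to be made 
in C inside remote-curl.c), batch_refs is a hypothetical name, and the byte limit 
is whatever budget remains after the fixed part of the fetch-pack command line.

```shell
#!/usr/bin/env bash
# Hypothetical helper: batch_refs LIMIT
# Reads ref names (one per line) on stdin and prints them as space-separated
# batches, one batch per output line. A new batch is started whenever adding
# the next ref (plus one separator byte) would push the batch over LIMIT.
# A single ref longer than LIMIT still gets a batch of its own.
batch_refs() {
    local limit=$1 batch="" size=0 ref len
    while IFS= read -r ref; do
        len=$(( ${#ref} + 1 ))
        if [ -n "$batch" ] && [ $(( size + len )) -gt "$limit" ]; then
            printf '%s\n' "$batch"
            batch=""
            size=0
        fi
        batch="${batch:+$batch }$ref"
        size=$(( size + len ))
    done
    # Flush the final, possibly partial, batch.
    if [ -n "$batch" ]; then
        printf '%s\n' "$batch"
    fi
}

# Example use (the limit value here is arbitrary):
#   git for-each-ref --format='%(refname)' | batch_refs 30000
```

Each output line would then supply the refs for one fetch-pack invocation.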

