Re: Errors cloning large repo

On Fri, 9 Mar 2007, Anton Tropashko wrote:
>
> > So you might be able to just do
> > 
> >     git add dir1
> >     git add dir2
> >     git add dir3
> >     ..
> >     git commit
> > 
> > or something.
>
> For some reason "git add ." swallowed the whole thing
> but "git commit" did not and I had to split it up. I trimmed the tree a bit
> since then by removing C & C++ files ;-)

Ok, that's a bit surprising, since "git commit" actually should do less 
than "git add .", but it's entirely possible that just the status message 
generation ends up doing strange things for a repository with that many 
files in it.

I should try it out with some made-up auto-generated directory setup, but 
I'm not sure I have the energy to do it ;)
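
(If anyone else feels like trying: something like this quick-and-dirty 
loop should generate a big enough test tree - the counts here are made 
up, tune to taste:

    for i in $(seq 1 200); do
        mkdir -p testdir/dir$i
        for j in $(seq 1 1000); do
            echo "file $i/$j" > testdir/dir$i/file$j
        done
    done

and then compare how "git add ." and "git commit" behave inside it.)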

> > But one caveat: git may not be the right tool for the job. May I inquire 
> > what the heck you're doing? We may be able to fix git even for your kinds 
>
> I dumped a rather large SDK into it. Headers, libraries,
> even crs.o from the toolchains that are part of the SDK. The idea is to keep
> the SDK versioned and be able to pull an arbitrary version once tagged.

Ok. Assuming most of this doesn't change very often (ie the crs.o files 
aren't actually *generated*, but come from some external thing), git 
should do well enough once it's past the original hump.

So your usage scenario doesn't sound insane, and it's something we should 
be able to support well enough.
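
For reference, the "pull an arbitrary version once tagged" part is just 
(tag name obviously made up):

    git tag sdk-2007-03          # tag the current state
    git checkout sdk-2007-03     # later: check out exactly that version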

> > So I'm not saying that git won't work for you, I'm just warning that the 
> > whole model of operation may or may not actually match what you want to 
> > do. Do you really want to track that 8.5GB as *one* entity?
>
> Yes. It would be nice if I didn't have to prune PDFs, txts, and who
> knows what else people put in there just to reduce the size.

Sure. 8.5GB is absolutely huge, and clearly you're hitting some problems 
here, but if we're talking things like having a whole development 
environment with big manuals etc., it might be a perfectly valid usage 
scenario.

That said, it might also be a good idea (regardless of anything else) to 
split things up, if only because it's quite possible that not everybody is 
interested in having *everything*. Forcing people to work with an 8.5GB 
repository when they might not care about it all could be a bad idea.
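
E.g. something along these lines (the layout is obviously made up):

    cd sdk/headers   && git init && git add . && git commit -m "headers"
    cd ../libs       && git init && git add . && git commit -m "libs"
    cd ../toolchain  && git init && git add . && git commit -m "toolchain"

so that people can clone just the pieces they actually care about.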

> >  - the file size is bigger than MAX_NON_LFS (2GB-1), and we don't use 
> >    O_LARGEFILE.
>
> Ok. I think you're correct:
> from ulimit -a:
> ...
> file size             (blocks, -f) unlimited

Ok, then it's the 2GB limit that the OS puts on you unless you tell it to 
use O_LARGEFILE.

Which is just as well, since the normal git pack-files won't index past 
that size *anyway* (ok, so it should index all the way up to 4GB, but it's 
close enough..)
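
You can find the files that hit that limit with something like

    find . -type f -size +2147483647c

and just keep those out of the repository for now.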

> Good to know developers are ahead of the users.

Well, not "ahead enough" apparently ;)

I was seriously hoping that we could hold off on the 64-bit issues for a 
bit longer, since the biggest real archive (Firefox) we've seen so far was 
barely over half a gigabyte.

> Is there a way to get rid of pending (uncommitted) changes?

"git reset --hard" will do it for you. As will "git checkout -f", for that 
matter.

"git revert" will just undo an old commit (as you apparently already found 
out)
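
In short:

    git reset --hard       # throw away all uncommitted changes
    git checkout -f        # force-overwrite working tree files from the index
    git revert <commit>    # create a *new* commit that undoes an old one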

		Linus