Hi,

On Fri, 15 May 2009, Linus Torvalds wrote:

> On Fri, 15 May 2009, Junio C Hamano wrote:
>
> > Johannes Schindelin <Johannes.Schindelin@xxxxxx> writes:
> >
> > > if you need a chuckle, like me, you might appreciate this story: in
> > > one of my repositories, "git gc" dies with
> > >
> > > 	unable to open object pack directory: ...: Too many open files
> > >
> > > turns out that there are a whopping 1088 packs in that repository...
> >
> > Isn't it a more serious problem than a mere chuckle?  How would one
> > recover from such a situation (other than "mv .git/objects/pack/pack-* .;
> > for p in pack-*.pack; do git unpack-objects <$p; done")?
>
> Well, you can probably just increase the file limits and try again.
> Depending on setup, you may need root to do so, though.
>
> I also think you _should_ be able to avoid this by just limiting the
> pack size usage. IOW, with some packed_git_limit, something like
>
> 	[core]
> 		packedGitWindowSize = 16k
> 		packedGitLimit = 1M
>
> you should hopefully be able to repack (slowly) even with a low file
> descriptor limit, because of the total limit on the size.

I don't think so: the window size only limits how much gets mmap()ed at
a time, it says nothing about the number of open file descriptors,
right?

> That said, I do agree that ulimit doesn't always work on all systems
> (whether due to hard system limits or due to not having permission to
> raise the limits), and playing games with pack limits is non-obvious. We
> should really try to avoid getting into such a situation. But I think git
> by default avoids it by the auto-gc, no? So you have to disable that
> explicitly to get into this bad situation.

No, in this case, nothing was disabled.  auto-gc did not kick in,
probably due to funny Git usage in hg2git.

> One solution - which I think may be the right one regardless - is to not
> use "mmap()" for small packs or small SHA1 files.
>
> mmap is great for random-access multi-use scenarios (and to avoid some
> memory pressure by allowing sharing of pages), but for anything that is
> just a couple of pages in size, mmap() just adds big overhead with
> little upside.
>
> So if we use malloc+read for small things, we'd probably avoid this. Now,
> if you have a few thousand _large_ packs, you'd still be screwed, but the
> most likely reason for having a thousand packfiles is that you did daily
> "git pull"s, and have lots and lots of packs that are pretty small.
>
> Dscho? What are your pack-file statistics in this case?

Mostly around 50kB.

But using malloc()+read() just to avoid my use case does not sound
straight-forward to me; it is more of a work-around than a proper
solution.

For performance, though, I agree that malloc()+read() might be a
sensible thing in a lot of cases (see the sketch below).

Ciao,
Dscho
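
P.S.: Just so we are talking about the same thing -- here is a rough
sketch of the malloc()+read() idea, with a made-up helper name and a
made-up 32kB cutoff; this is not the actual pack-file code, of course:

	#include <fcntl.h>
	#include <stdlib.h>
	#include <sys/mman.h>
	#include <sys/stat.h>
	#include <unistd.h>

	/* Hypothetical cutoff: files smaller than this skip mmap(). */
	#define SMALL_FILE_LIMIT (32 * 1024)

	/*
	 * Load a whole file.  Small files are slurped into a malloc()ed
	 * buffer so the descriptor can be closed right away; larger
	 * files are still mmap()ed.  *mapped tells the caller whether
	 * to munmap() or free() the result.
	 */
	static void *load_file(const char *path, size_t *size, int *mapped)
	{
		struct stat st;
		void *buf = NULL;
		int fd = open(path, O_RDONLY);

		if (fd < 0 || fstat(fd, &st) < 0)
			goto out;
		*size = st.st_size;

		if (st.st_size < SMALL_FILE_LIMIT) {
			char *p = malloc(st.st_size);
			off_t done = 0;

			if (!p)
				goto out;
			while (done < st.st_size) {
				ssize_t ret = read(fd, p + done,
						st.st_size - done);
				if (ret <= 0) {	/* error or early EOF */
					free(p);
					goto out;
				}
				done += ret;
			}
			buf = p;
			*mapped = 0;
		} else {
			buf = mmap(NULL, st.st_size, PROT_READ,
					MAP_PRIVATE, fd, 0);
			if (buf == MAP_FAILED)
				buf = NULL;
			else
				*mapped = 1;
		}
	out:
		if (fd >= 0)
			close(fd);	/* a mapping survives the close() */
		return buf;
	}

With the small-file path, a thousand 50kB packs would cost some heap,
but no lingering descriptors or mappings; the caller just free()s or
munmap()s depending on *mapped.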