On Tue, Dec 03, 2019 at 08:04:15AM -0800, Junio C Hamano wrote:

> Jeff King <peff@xxxxxxxx> writes:
>
> > Good catch. The issue is that we only add entries to the hashmap in
> > prepare_packed_git(), but they may be added to the pack list by other
> > callers of install_packed_git(). It probably makes sense to just push
> > the hashmap maintenance down into that function, like below.
>
> Makes sense to me.
>
> Let me locally squash your fix in and credit you with helped-by
> footer in the amended log message. Strictly speaking, this may
> invalidate the perf numbers, but I do not think the scenario p5303
> sets up alone is all that interesting anyway---if you have 10,000
> packs, not just registering them (which is improved with this patch)
> but using objects from them would be slower than necessary X-<.

Thanks, that sounds good. I actually re-checked the perf numbers (mostly
to make sure I didn't screw anything up) and got similar results.

I agree that 10,000 packs is ridiculous, but we do see it (and worse)
occasionally from people pushing in a loop before our scheduled
maintenance kicks in. And it's quadratic, so if you hit 30k packs,
that's three times the packs but another factor of 9 worse in runtime.
It makes even diagnosing the repository pretty painful. :)

Also a fun fact: Linux actually has a limit on the number of
simultaneous mmaps that a process can have, which defaults to ~64k. But
if you have 32k packs, then we'll map both the pack and the idx file
for each, which is 64k maps right there. Plus whatever you need for
mapping the binary, libraries, etc., plus any maps opened by malloc()
for large requests. I have occasionally run into this trying to repack
some very out-of-control cases (the magic fix is to double the limit
with `sysctl -w vm.max_map_count=131060`, if you are curious).

I also wondered if this might be made worse by the recent change to
drop release_pack_memory(). But I ran into it even before that change,
because zlib calls malloc() directly.

We're also pretty aggressive about dying when mmap() returns an error
(rather than closing packs and trying again). I think Git _could_
handle this more gracefully by trying to keep fewer packs mapped at one
time (the way we similarly try not to use up all of the file
descriptors); I've sketched the idea below. But I have a hard time
caring too much, since it's such a ridiculous situation in the first
place. Bumping the limits is an easy operational fix.
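To illustrate the shape of that "more graceful" behavior, here is a
rough sketch, not a patch: close_one_pack_window() is an imaginary
helper that would evict the least-recently-used pack window, similar
in spirit to how close_one_pack() frees up file descriptors.

#include <errno.h>
#include <sys/types.h>
#include <sys/mman.h>

/* imaginary helper: unmap one pack window; returns non-zero on success */
int close_one_pack_window(void);

/*
 * Instead of dying when mmap() fails, drop an existing pack window
 * and retry, until we either succeed or run out of windows to
 * release. ENOMEM is what you get when vm.max_map_count is exhausted.
 */
static void *map_pack_gently(void *start, size_t length, int prot,
			     int flags, int fd, off_t offset)
{
	void *ret = mmap(start, length, prot, flags, fd, offset);

	while (ret == MAP_FAILED && errno == ENOMEM) {
		if (!close_one_pack_window())
			break; /* nothing left to release; give up */
		ret = mmap(start, length, prot, flags, fd, offset);
	}
	return ret;
}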
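And going back to the squash itself, in case a summary helps review:
the idea is that install_packed_git() becomes the single place that
does the hashmap maintenance. Roughly like this (a sketch from memory,
in packfile.c, using the pack_map/packmap_ent names from the series;
the actual patch is of course authoritative):

#include "packfile.h"
#include "object-store.h"

void install_packed_git(struct repository *r, struct packed_git *pack)
{
	if (pack->pack_fd != -1)
		pack_open_fds++;

	/* as before: link the pack into the pack list */
	pack->next = r->objects->packed_git;
	r->objects->packed_git = pack;

	/*
	 * Register in the by-name hashmap here, so that every caller
	 * of install_packed_git() keeps the map in sync, not just
	 * prepare_packed_git().
	 */
	hashmap_entry_init(&pack->packmap_ent, strhash(pack->pack_name));
	hashmap_add(&r->objects->pack_map, &pack->packmap_ent);
}

-Peff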