Junio C Hamano <junkio@xxxxxxx> wrote:
> Step 3. Work on integrating partial mmap() support with Shawn.
>         This is more or less orthogonal to 4GB ceiling (people
>         would hit mmap() limit even with a 1.5GB pack), but I
>         suspect it would be necessary to be able to tell where
>         the end of each pack entry is cheaply to implement
>         this.

I was just getting ready to move my partial mmap support over from
fast-import, although I implemented it a little differently in
fast-import than what I think I'll do in core Git.

In fast-import I store an in-memory hashtable of all objects in the
pack, but I chose not to store the ending offset (or compressed
length) and instead just guess where the object ends. I did that to
save 4 bytes of memory per object. :-)

It's necessary to know where the object ends to ensure that your
current mapping (or any remapping you are about to do) covers the
entire object before you start inflating. Otherwise you might have
to remap the pack in the middle of the inflate operation. (Of course
you might need to do this anyway if the compressed object is larger
than your default mapping unit.)

What I did in fast-import was give inflate whatever was left in the
current mapping; then, if I got Z_OK or Z_BUF_ERROR back from inflate,
I moved the mapping to the next 128 MiB chunk, reset my z_stream's
next_in/avail_in accordingly, and called inflate again.

No, I didn't performance test it to see how often I map a pack
multiple times to get one object. But I'm going to stick my neck out
and say that most objects probably don't have a compressed length
exceeding 128 MiB, so we're talking about one remap that we would have
had to do anyway if the object spanned the end of the current mapping.
If the object's starting offset was completely outside of the current
mapping then I rounded the offset down to a multiple of the page size
(from getpagesize) and remapped; therefore we also probably only do
one remap on objects needing it.
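
To make the window arithmetic above concrete, here is a minimal sketch
in C. It is not the fast-import code: struct pack_window and every
function name below are hypothetical, and the mapping unit and page
size are passed in as parameters rather than taken from getpagesize().
It only computes where a mapping would start and end; the mmap() and
inflate() calls themselves are left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of the currently mapped part of a pack. */
struct pack_window {
	uint64_t offset; /* file offset of the first mapped byte */
	uint64_t len;    /* number of bytes mapped */
};

/* Round a file offset down to a multiple of the page size. */
static uint64_t align_down(uint64_t offset, uint64_t page_size)
{
	return offset - (offset % page_size);
}

/* Is the byte at `offset` inside the window? */
static int window_contains(const struct pack_window *w, uint64_t offset)
{
	return w->offset <= offset && offset - w->offset < w->len;
}

/*
 * Does the window cover the whole range [start, end)?  This check is
 * only possible when the ending offset of the object is known.
 */
static int window_covers(const struct pack_window *w,
			 uint64_t start, uint64_t end)
{
	return start < end && w->offset <= start &&
	       end - w->offset <= w->len;
}

/* Clamp a window starting at `offset` to the unit and the pack size. */
static struct pack_window make_window(uint64_t offset, uint64_t pack_size,
				      uint64_t unit)
{
	struct pack_window w;

	w.offset = offset;
	w.len = pack_size - offset;
	if (w.len > unit)
		w.len = unit;
	return w;
}

/*
 * Window to map when an object's start lies outside the current one.
 * Assumes obj_start < pack_size.
 */
static struct pack_window place_window(uint64_t obj_start, uint64_t pack_size,
				       uint64_t page_size, uint64_t unit)
{
	return make_window(align_down(obj_start, page_size), pack_size, unit);
}

/*
 * Window to map when inflate ran out of input at the end of `w`.
 * Assumes `unit` is a multiple of the page size, so the new offset
 * stays page aligned.
 */
static struct pack_window next_window(const struct pack_window *w,
				      uint64_t pack_size, uint64_t unit)
{
	return make_window(w->offset + w->len, pack_size, unit);
}
```

With a 4096-byte page and a 64 KiB unit, an object starting at offset
10000 gets a window starting at 8192. Without a known ending offset
only window_contains() can be asked before inflating; window_covers()
is the check that a stored length or ending offset would enable.
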
But having the length or ending offset in the index will help with
copying the object during a repack as well as prevent us from needing
to guess during accesses. So good news indeed that you are adding it
to the index.

-- 
Shawn.