Jon Smirl <jonsmirl@xxxxxxxxx> wrote:
> If you're going to redo the pack formats another big win for the
> Mozilla pack is to convert pack internal sha1 references into file
> offsets within the pack. Doing that will take around 30MB off from the
> Mozilla pack size. sha1's are not compressible so this is a direct
> savings.

Right now Junio's working on the index to break the 4 GiB barrier.
I think Junio and Nico have already agreed to change the base SHA1
to be an offset instead; though this is an issue for the current
way the base gets written out behind the delta, as you need to know
exactly how many bytes the delta is going to be so you can correctly
compute the offset.

> This might reduce memory usage too. The index is only needed to get
> the initial object from the pack. Since index use is lighter it could
> just be open/closed when needed.

True; however when you are walking a series of commits (to produce
output for `git log` for example) every time you parse a commit you
need to go back to the .idx to look up the ancestor commit(s) again.
So you don't want to open/close the .idx file on every object;
instead put the .idx file into the LRU like the .pack files are
(or into their own LRU chain) and maintain some threshold on how
many bytes worth of .idx is kept live.

> You could also introduce a zlib dictionary object into the format and
> just leave it empty for now.

No. I'm not sure I'm ready to propose that as a solution for
decreasing pack size.

Now that my exams are over I've started working on a true
dictionary based compression implementation. I want to try to get
Git itself repacked under it, then try the Mozilla pack after I get
my new amd64 based system built. If that's as big of a space saver
as we're hoping it would be then the pack format would be radically
different and need to change; if it doesn't gain us anything (or is
worse!) then we can go back to the drawing board and consider other
pack format changes such as a zlib dictionary.
But right now its measly 4% gain isn't very much.

-- 
Shawn.
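
To make the .idx LRU idea above concrete, here is a rough sketch in
Python rather than C, just to show the shape of it. Everything in it is
an assumption for illustration: the IdxCache name, the max_bytes
threshold and the loader callback are hypothetical and are not existing
Git code. It keeps opened .idx data in least-recently-used order and
drops the oldest entries once the live byte count exceeds the threshold.

```python
from collections import OrderedDict


class IdxCache:
    """LRU cache of opened .idx data, bounded by total bytes kept live.

    Hypothetical sketch: names and interface are illustrative only.
    """

    def __init__(self, max_bytes, loader):
        self.max_bytes = max_bytes  # assumed threshold on live .idx bytes
        self.loader = loader        # callable: path -> bytes-like object
        self.live = OrderedDict()   # path -> data, least recently used first
        self.live_bytes = 0

    def get(self, path):
        """Return the .idx data for path, loading it only on a miss."""
        if path in self.live:
            # Hit: mark as most recently used, no reopen needed.
            self.live.move_to_end(path)
            return self.live[path]
        data = self.loader(path)
        self.live[path] = data
        self.live_bytes += len(data)
        self._evict()
        return data

    def _evict(self):
        # Drop least recently used entries until under the threshold,
        # but always keep the entry that was just requested.
        while self.live_bytes > self.max_bytes and len(self.live) > 1:
            _, old = self.live.popitem(last=False)
            self.live_bytes -= len(old)
```

With something like this a commit walk that keeps hitting the same
.idx pays the open cost once, and a rarely used .idx falls out on its
own when the threshold is reached. Whether the .idx files should share
the .pack LRU or get their own chain is left open here.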