On Mon, Apr 30, 2012 at 9:00 PM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
>> * The prefix compression. Thomas is not using this idea; we've been
>>   toying with making the index bisectable (within each directory) for
>>   fast single-entry lookups, which inherently conflicts with this. The
>>   directory-like layout partially achieves the same (elides common path
>>   components).
>>
>> * The varint encoding (or offset encoding, but "varint" is something you
>>   can google :-). David suggested using it on stat() data, combined
>>   with zigzag encoding and delta against the first entry in the
>>   directory, which gives some good compression results. Profiling will
>>   have to say whether the extra decoding effort is worth the space
>>   savings.
>>
>> * The lack of variable padding, which is a good idea -- in any case I
>>   seem to remember Shawn complaining about it.

I complain about a lot of things. Here is another...

> I am planning to merge this series early to 'master', before the GSoC
> student really starts working on the code, perhaps by this Wednesday. The
> earlier parts of this series refactor code to make things easier to
> modify, and the later parts of it demonstrate by example both:

I think this is a bad idea.

For the sake of argument, let's say the GSoC project goes really well,
and the student creates a great implementation of (what is now) index
v5. Let's say we all agree it's a great evolution of the format, the
implementation is sane, and there is no reason not to merge it and make
it the default.

If this v4 thing merges to master and you make a release from master,
we are potentially stuck supporting this new v4 format for the next 2
years, along with the v5 that we would want to replace it with
immediately. If any OS distro picks up a Git release that supports v4
but not v5, and parks it in their stable tree, the rest of the Git
ecosystem (e.g. libgit2, JGit) will be supporting v4 until that OS
distro release dies and all of its users are able to move to a newer
distro with a newer Git version.

Consider my case at $DAY_JOB, where we still have Git 1.7.7.3 as the
standard Git. Upstream has already shipped 1.7.10 and is well on its
way to 1.7.11, but the distro chose to freeze on 1.7.7 rather
arbitrarily, because that was the latest stable release at the time the
distro was freezing its package sets for its own release. Yay.

IMHO, keep this in next to avoid releasing it until we know the outcome
of the GSoC project. The handful of WebKit developers that use Git and
really benefit from index v4 can use it by building and installing
their own next. If they can't work `make install prefix=$HOME/git`,
they might want to reconsider their career and hobby activities. And we
can be sure it won't show up in a distro release, which spares us from
having to support what may turn out to be a dead-end index v4.

The GSoC student can build on this topic until their own work arrives
in your tree. It's only a few months to wait and see where "v5" goes.
If v5 is successful, v4 will just be a minor footnote in the history of
Git, and other tools won't need to support v4; they can go straight to
v5. If v5 fails and we choose to ship and commit to supporting v4, it's
only a few months' delay.

We have had index v2/v3 for years. We (and our users) can wait a couple
of additional months for a format we can support.
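
For anyone who has not met the encoding mentioned in the quoted text at the top, here is a rough sketch of "varint, combined with zigzag encoding and delta against the first entry in the directory". It is in Python rather than Git's C purely for brevity. All function names are made up for illustration, and the varint shown is the LEB128-style one a web search turns up; that byte layout is an assumption, not necessarily what the series or the GSoC work uses.

```python
from typing import List, Tuple


def zigzag_encode(n: int) -> int:
    """Map signed to unsigned so small magnitudes stay small: 0,-1,1,-2 -> 0,1,2,3."""
    return (n << 1) if n >= 0 else ((-n << 1) - 1)


def zigzag_decode(z: int) -> int:
    """Inverse of zigzag_encode."""
    return (z >> 1) if (z & 1) == 0 else -((z + 1) >> 1)


def varint_encode(value: int) -> bytes:
    """LEB128-style varint: 7 payload bits per byte, high bit set on all but the last."""
    if value < 0:
        raise ValueError("varint_encode expects a non-negative integer")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)         # final byte
            return bytes(out)


def varint_decode(buf: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode one varint starting at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result, pos
        shift += 7


def pack_field(values: List[int]) -> bytes:
    """Hypothetical layout: first value stored as-is, the rest as
    zigzag(delta against the first value), everything varint-encoded."""
    if not values:
        return b""
    base = values[0]
    out = bytearray(varint_encode(base))
    for v in values[1:]:
        out += varint_encode(zigzag_encode(v - base))
    return bytes(out)


def unpack_field(buf: bytes, count: int) -> List[int]:
    """Inverse of pack_field for a known number of entries."""
    if count == 0:
        return []
    base, pos = varint_decode(buf, 0)
    values = [base]
    for _ in range(count - 1):
        z, pos = varint_decode(buf, pos)
        values.append(base + zigzag_decode(z))
    return values


# Example: four made-up mtimes from files in one directory.
mtimes = [1335812400, 1335812403, 1335812390, 1335812400]
packed = pack_field(mtimes)
restored = unpack_field(packed, len(mtimes))
```

With these made-up timestamps the four values pack into 8 bytes instead of 16 for fixed 32-bit fields, because entries in one directory tend to have stat() values close to each other. Whether the extra decoding is worth it is exactly the profiling question raised in the quoted text.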