On Mon, 26 Nov 2007, Nicolas Pitre wrote:
> On Mon, 26 Nov 2007, Shawn O. Pearce wrote:
> > > - Loose objects storage is difficult to work with
> >
> > The standard loose object format of DEFLATE("$type $size\0$data")
> > makes it harder to work with as you need to inflate at least
> > part of the object just to see what the hell it is or how big
> > its final output buffer needs to be.
>
> It is a bit cumbersome indeed, but I'm afraid we're really stuck with it
> since every object SHA1 depends on that format.

No.

The SHA1 itself just depends on "$type $size\0$data" (no deflate phase),
and that one is easy and cheap to calculate.

How we then *encode* the data on disk is totally immaterial. Pack-files
obviously do not encode it in that form at all; they use two different
forms,

	"$binaryhdr$DEFLATE($data)"

or

	"$binaryhdr$basesha$DEFLATE($delta)"

(that's from memory, so don't rely on that).

So we could easily change the on-disk format, and we obviously have - the
alternate (but deprecated) format for unpacked objects already did.

In fact, we could - and probably should - add some kind of "back end
interface" for alternate encoding formats, in case somebody wants to do
something really crazy like use a database for object tracking.

(Side note: using an actual database would really be insane. There is
absolutely zero point. But what *could* be interesting would be to have a
"cluster back-end" for the git object store, where objects get hashed to
different nodes. If you have a really fast network, it may actually be
beneficial to spread the objects out, and get better disk throughput by
that kind of strange "git object RAID-0 striping" setup)

		Linus

(*) Honesty in advertising: the really *original* format did the SHA1
after the deflate, but that was quickly fixed and was a really stupid
choice. The main point for doing that was that it meant that loose objects
could be verified by just running "sha1sum" on them, and comparing the
result with their name.
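To make the hash-versus-encoding distinction concrete, here is a rough Python sketch. It assumes only what is stated above: the object name is the SHA1 of the uncompressed "$type $size\0$data" string, and the standard loose format is DEFLATE of that same string. The function names are made up for illustration, and encode_split()/decode_split() describe a hypothetical alternate layout (plain header, then deflated payload), not git's real pack or loose format.

```python
import hashlib
import zlib
from typing import Tuple


def _header(obj_type: str, data: bytes) -> bytes:
    # "$type $size\0" with the size written as a decimal byte count.
    return f"{obj_type} {len(data)}".encode("ascii") + b"\0"


def object_id(obj_type: str, data: bytes) -> str:
    """SHA1 over the uncompressed "$type $size\\0$data" string."""
    return hashlib.sha1(_header(obj_type, data) + data).hexdigest()


def encode_loose(obj_type: str, data: bytes) -> bytes:
    """Standard loose encoding: DEFLATE("$type $size\\0$data")."""
    return zlib.compress(_header(obj_type, data) + data)


def decode_loose(raw: bytes) -> Tuple[str, bytes]:
    """Inflate everything, then split the header from the payload."""
    header, _, data = zlib.decompress(raw).partition(b"\0")
    obj_type, size = header.decode("ascii").split(" ")
    if int(size) != len(data):
        raise ValueError("size in header does not match payload length")
    return obj_type, data


def encode_split(obj_type: str, data: bytes) -> bytes:
    """Hypothetical alternate encoding: plain header + DEFLATE(data).

    Type and size can be read without inflating anything.
    """
    return _header(obj_type, data) + zlib.compress(data)


def decode_split(raw: bytes) -> Tuple[str, bytes]:
    """Read the plain header first, then inflate only the payload."""
    header, _, compressed = raw.partition(b"\0")
    obj_type, size = header.decode("ascii").split(" ")
    data = zlib.decompress(compressed)
    if int(size) != len(data):
        raise ValueError("size in header does not match payload length")
    return obj_type, data


if __name__ == "__main__":
    payload = b"hello world\n"
    # Both encodings differ on disk but decode to the same object,
    # so the name computed from the decoded content is identical.
    for decode, raw in (
        (decode_loose, encode_loose("blob", payload)),
        (decode_split, encode_split("blob", payload)),
    ):
        obj_type, data = decode(raw)
        print(object_id(obj_type, data))
```

Both iterations print the same name, which is the whole point: the name is computed from the logical content, so a back end is free to pick whatever on-disk layout suits it.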