On Sat, Jan 24, 2009 at 10:39:46AM -0800, Chad Dombrova wrote:

>> I think Tim Ansell (cced) was talking about this at the gittogether
>> (storing the metadata separately), as it would benefit sparse/narrow
>> checkout, another advantage supporting his case?
>
> what's the case against it, other than the obvious, that it will take
> more work?

I'm not sure this is actually the same as Tim's proposal. Tim wanted to
store the commit and tree information separately from the blob
information (since his use case was that blobs are enormous, but the
rest is reasonable).

AIUI, Chad's proposal is about storing the actual blob data itself
separate from the blob object's metadata (i.e., its object type and
length headers). Which means that the normal loose object format is not
acceptable, and you would end up with something like (for example):

  .git/objects/pack/pack-full-of-your-regular-stuff.{pack,idx}
  .git/objects/[0-9a-f]{2}/[0-9a-f]{38}/header
  .git/objects/[0-9a-f]{2}/[0-9a-f]{38}/data

or something similar. Then you could hardlink directly to the 'data'
portion.

So you would need:

  - to teach everything that ever looks for loose objects how to read
    this new format. In theory, it's all nicely encapsulated in
    sha1_file.c.

  - to teach checkout routines to hardlink such a case instead of
    copying the file

The obvious downsides that I can think of are:

  - it has the potential to make object reading, which is a core part
    of git (read: very performance- and correctness-sensitive) a lot
    more complex. But maybe the implementation would not be that
    painful; somebody would have to look very closely to see.

  - it interacts badly with smudge/clean filters and crlf conversion.
    In those cases you can't hardlink. If you treat this like an
    optimization, though, it's not so bad: we only do the optimization
    when we _can_, and fall back to regular checkout if those other
    options are in effect.

  - it's somewhat dangerous to your repository's health.
    Git's model is that object files are immutable (since they are,
    after all, named after their contents). But now you are linking
    them into your working tree, which makes them susceptible to some
    third party tool munging them. So yes, most tools will probably
    behave, but any tool that misbehaves will actually corrupt your
    repository.

-Peff
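
P.S. To make the "optimization with fallback" idea and the corruption
risk concrete, here is a rough Python sketch. It is not git code:
checkout_blob() and demo() are hypothetical names, the 'data' file is
just a temporary file standing in for the separated blob data, and
needs_conversion stands in for "smudge/clean or crlf conversion is in
effect". The only real calls are os.link() and shutil.copyfile().

```python
import os
import shutil
import tempfile


def checkout_blob(data_path, dest_path, needs_conversion=False):
    """Place an object's data file into the working tree.

    Hypothetical helper: hardlink when nothing has to rewrite the
    content, otherwise (or if linking fails) fall back to a copy.
    Returns "hardlink" or "copy".
    """
    if os.path.lexists(dest_path):
        os.remove(dest_path)
    if not needs_conversion:
        try:
            os.link(data_path, dest_path)
            return "hardlink"
        except OSError:
            # e.g. cross-device link, or a filesystem without hardlinks
            pass
    # A real checkout would apply filters/conversion here; this sketch
    # only copies the bytes.
    shutil.copyfile(data_path, dest_path)
    return "copy"


def demo():
    """Check out one blob both ways, then edit both files in place."""
    with tempfile.TemporaryDirectory() as tmp:
        # Stand-in for .git/objects/xx/.../data
        data = os.path.join(tmp, "data")
        with open(data, "wb") as f:
            f.write(b"original\n")

        linked = os.path.join(tmp, "linked.txt")
        copied = os.path.join(tmp, "copied.txt")
        how_linked = checkout_blob(data, linked)
        how_copied = checkout_blob(data, copied, needs_conversion=True)

        # A misbehaving tool that rewrites working-tree files in place.
        for path in (linked, copied):
            with open(path, "r+b") as f:
                f.write(b"MUNGED")

        with open(data, "rb") as f:
            object_after = f.read()
        return how_linked, how_copied, object_after
```

Where the hardlink succeeds, the in-place write to linked.txt also
changes the bytes behind 'data', so the stored content no longer matches
the name it was filed under. The write to copied.txt leaves 'data'
alone. A tool that writes a new file and renames it over the old one
would only break the link, which is presumably why most tools would
behave.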