Junio C Hamano <gitster@xxxxxxxxx> writes:

> Just wondering if it would help to have the third kind of object
> representation in the object database, sitting next to loose objects
> and packed objects, say .git/objects/verbatim/<hex-object-name> for
> the contents and .git/objects/verbatim/<hex-object-name>.type that
> records "blob", "tree", "commit", or "tag" (in practice, I would
> expect huge "blob" objects would be the only ones that use this
> mechanism).
>
> The contents will be stored verbatim without compression and without
> any object header (i.e., the usual "<type> <length>\0") and the file
> could be "ln"ed (or "cow"ed if the underlying filesystem allows it)
> to materialize it in the working tree if needed.

This sounds like a useful feature. We probably would want to use the
"ln" or "cow" every time we use streaming (stream_blob_to_fd() in
streaming.h) currently, so hopefully we won't need to increase the
number of ways in which we can write an object to the worktree (just
change the streaming to write to a filename instead of an fd).

> "fsck" needs to be told about how to verify them. Create the object
> header in-core and hash that, followed by the contents of that file,
> and make sure the result matches the <hex-object-name> part of the
> filename, or something like that.

Yeah, this sounds like what index-pack is doing - the hash algo can
take the contents of one buffer (a header that we synthesize
ourselves), and then take the contents of another buffer (the file
contents).
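
To make the fsck idea concrete, here is a rough sketch in Python (not
git's C code) of the two-buffer hashing described above: feed a
synthesized "<type> <length>\0" header to the hash, then the file
contents, and compare the result with the filename. It assumes SHA-1
object names and the proposed .git/objects/verbatim/ layout with a
".type" sidecar file; the function names verbatim_object_id() and
verify_verbatim_object() are made up for illustration.

```python
import hashlib
import os


def verbatim_object_id(path, obj_type="blob", algo="sha1", chunk_size=1 << 20):
    """Hash a synthesized "<type> <length>\\0" header, then the raw file contents."""
    size = os.path.getsize(path)
    hasher = hashlib.new(algo)
    # First buffer: the object header that the verbatim file does not store.
    hasher.update(f"{obj_type} {size}".encode("ascii") + b"\0")
    # Second buffer: the file contents, read in chunks so that huge blobs
    # never have to fit in memory.
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            hasher.update(chunk)
    return hasher.hexdigest()


def verify_verbatim_object(path):
    """Check a (hypothetical) verbatim object file against its own filename.

    Assumes "<path>.type" holds one of "blob", "tree", "commit", or "tag",
    as in the proposal quoted above.
    """
    with open(path + ".type", "r", encoding="ascii") as f:
        obj_type = f.read().strip()
    return verbatim_object_id(path, obj_type) == os.path.basename(path)
```

For an empty file recorded as a "blob", verbatim_object_id() gives
e69de29bb2d1d6434b8b29ae775ad8c2e48c5391, the well-known name of the
empty blob, so the synthesized header matches what loose and packed
objects hash today.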