On Thu, Sep 28, 2023 at 02:40:10PM -0700, Jonathan Tan wrote:

> Junio C Hamano <gitster@xxxxxxxxx> writes:
> > Just wondering if it would help to have the third kind of object
> > representation in the object database, sitting next to loose objects
> > and packed objects, say .git/objects/verbatim/<hex-object-name> for
> > the contents and .git/objects/verbatim/<hex-object-name>.type that
> > records "blob", "tree", "commit", or "tag" (in practice, I would
> > expect huge "blob" objects would be the only ones that use this
> > mechanism).
> >
> > The contents will be stored verbatim without compression and without
> > any object header (i.e., the usual "<type> <length>\0") and the file
> > could be "ln"ed (or "cow"ed if the underlying filesystem allows it)
> > to materialize it in the working tree if needed.
>
> This sounds like a useful feature. We probably would want to use the
> "ln" or "cow" every time we use streaming (stream_blob_to_fd() in
> streaming.h) currently, so hopefully we won't need to increase the
> number of ways in which we can write an object to the worktree (just
> change the streaming to write to a filename instead of an fd).

One thing that scares me about a regular "ln" between the worktree and
odb is that you are very susceptible to corrupting the repository by
modifying the worktree file with regular tools. If they do a complete
rewrite and atomic rename (or link) to put the new file in place, that
is OK. But opening the file for appending, or general writing, is bad.

You can get some safety with the immutable attribute (which applies to
the inode itself, and thus any path that hardlinks to it). But setting
that usually requires being root. And it creates other irritations for
normal use (you have to unset it before even removing the hardlink).

It would be nice if there was some portable copy-on-write abstraction we
could rely on, but I don't think there is one.

-Peff
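
To make the hardlink hazard above concrete, here is a minimal sketch in Python (not Git code). The file names "odb_object" and "worktree_file" and the function hardlink_hazard_demo() are hypothetical stand-ins for a verbatim object and its checked-out copy; they do not reflect any actual Git layout. It assumes a filesystem that supports hard links.

```python
import os
import tempfile


def hardlink_hazard_demo():
    """Return the 'object' contents after a safe and an unsafe edit.

    Hypothetical illustration only: 'odb_object' stands in for a
    verbatim object, 'worktree_file' for its hardlinked checkout.
    """
    with tempfile.TemporaryDirectory() as d:
        odb = os.path.join(d, "odb_object")
        wt = os.path.join(d, "worktree_file")

        with open(odb, "wb") as f:
            f.write(b"original\n")
        os.link(odb, wt)  # both paths now name the same inode

        # Safe edit: write a new file, then atomically rename it into
        # place. The worktree path gets a new inode; the object is
        # left untouched.
        tmp = wt + ".tmp"
        with open(tmp, "wb") as f:
            f.write(b"rewritten\n")
        os.replace(tmp, wt)
        with open(odb, "rb") as f:
            after_rename = f.read()

        # Unsafe edit: re-create the hardlink, then append in place.
        # The write goes to the shared inode, so the object changes too.
        os.remove(wt)
        os.link(odb, wt)
        with open(wt, "ab") as f:
            f.write(b"appended\n")
        with open(odb, "rb") as f:
            after_append = f.read()

        return after_rename, after_append


if __name__ == "__main__":
    print(hardlink_hazard_demo())
```

The rename case leaves the object bytes unchanged, while the append case alters them, which in a real repository would mean the stored contents no longer match the object name.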