On Wed, Jan 18, 2023 at 04:38:40PM -0500, Taylor Blau wrote:

> On Wed, Jan 18, 2023 at 12:59:24PM -0800, Junio C Hamano wrote:
> > The --literally option was invented initially primarily to allow a
> > bogus type of object (e.g. "hash-object -t xyzzy --literally") but I
> > am happy to see that we are finding different uses.  I wonder if
> > these objects of known types but with syntactically bad contents can
> > be "repack"ed from loose into packed?
> >
> > > [5/6]: fsck: provide a function to fsck buffer without object struct
>
> It is indeed possible:
>
> --- >8 ---
> Initialized empty Git repository in /home/ttaylorr/src/git/t/trash directory.t9999-test/.git/
> expecting success of 9999.1 'repacking corrupt loose object into packed':
>   name=$(echo $ZERO_OID | sed -e "s/00/Q/g") &&
>   printf "100644 fooQ$name" | q_to_nul |
>   git hash-object -w --stdin -t tree >in &&
>
>   git pack-objects .git/objects/pack/pack <in
>
> Enumerating objects: 1, done.
> Counting objects: 100% (1/1), done.
> 06146c77fd19c096858d6459d602be0fdf10891b
> Writing objects: 100% (1/1), done.
> Total 1 (delta 0), reused 0 (delta 0), pack-reused 0
> ok 1 - repacking corrupt loose object into packed
> --- 8< ---

Right, we don't do any fsck-ing when packing objects. Nor should we, I
think. We should be checking objects when they come into the repository
(via index-pack/unpack-objects) or when they're created (hash-object),
but there's little need to do so when they migrate between storage
formats.

The fact that "--literally" manually writes a loose object is mostly an
implementation detail. I think if we are not writing an object with an
esoteric type, that it could even just hit the regular index_fd() code
path (and drop the HASH_FORMAT_CHECK flag).

If you do write one with "-t xyzzy", I think pack-objects would barf,
but not because of fsck checks. It just couldn't represent that type
(which really makes such objects pretty pointless; you cannot ever fetch
or push them!).

-Peff
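
To illustrate why a "-t xyzzy" object can exist loose but not in a pack, here is a minimal Python sketch (not Git's actual code). It assumes the documented formats: a loose object's ID is the SHA-1 of "<type> <size>\0<payload>", where the type is a free-form string, while a pack entry stores the type as a small numeric code. The helper names loose_object_id() and pack_type_code() are hypothetical and exist only for this illustration.

```python
import hashlib

# Numeric type codes used in pack entry headers (3-bit field).
# Codes 6 and 7 are the two delta encodings; there is no slot for
# an arbitrary type name.
PACK_TYPE_CODES = {"commit": 1, "tree": 2, "blob": 3, "tag": 4}


def loose_object_id(obj_type: str, payload: bytes) -> str:
    """Hypothetical helper: SHA-1 over "<type> <size>\\0<payload>".

    The type is just text in the header, so any string can be hashed,
    including a bogus one such as "xyzzy".
    """
    header = f"{obj_type} {len(payload)}".encode() + b"\0"
    return hashlib.sha1(header + payload).hexdigest()


def pack_type_code(obj_type: str) -> int:
    """Hypothetical helper: map a type name to its pack type code.

    Raises ValueError for types the pack format has no code for.
    """
    try:
        return PACK_TYPE_CODES[obj_type]
    except KeyError:
        raise ValueError(f"cannot represent type {obj_type!r} in a pack") from None


if __name__ == "__main__":
    # A bogus type still yields a well-formed 40-hex-digit object ID.
    print(loose_object_id("xyzzy", b"hello"))
    # A known type maps to a pack code; "xyzzy" would raise ValueError.
    print(pack_type_code("tree"))
```

Note that nothing here inspects the payload: whether a tree's contents are well formed is a separate question from whether its type can be stored, which matches the point that the failure would not come from fsck checks.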