On Tue, May 24, 2022 at 12:39:11PM -0700, Jonathan Nieder wrote: > Ævar Arnfjörð Bjarmason wrote: > > On Fri, May 20 2022, Taylor Blau wrote: > > On Mon, Nov 29 2021, Taylor Blau wrote: > > > > +static void write_mtimes_header(struct hashfile *f) > > > +{ > > > + hashwrite_be32(f, MTIMES_SIGNATURE); > > > + hashwrite_be32(f, MTIMES_VERSION); > > > + hashwrite_be32(f, oid_version(the_hash_algo)); > > > +} > [...] > > But since this is a new format I think it's worth considering not using > > the 1 or 2 you get from oid_version(), but the "format_id", > > i.e. GIT_SHA1_FORMAT_ID or GIT_SHA256_FORMAT_ID. > > > > You'll use the same space in the format for it, but we'll end up with > > something more obvious (as the integer encodes the sha1 or sha256 name). > > Agreed. I know we recommend using the format_id for on-disk formats, but I think there is enough existing uses of "1" or "2" that either are acceptable in practice. E.g., grepping around for "hashwrite.*oid_version", there are three existing formats that use "1" or "2" instead of the format_id. They are: - the commit-graph format - the midx format - the .rev format Moreover, I can't seem to find any formats that _don't_ use that convention. So I have a vague preference towards using the values "1" and "2" as we currently do in these patches. (TBH, I don't find "sha1" significantly more interpretable than just "1", so I would be just as happy leaving it as-is). > [...] > > Other than that the only question I have (I think) on this series is if > > Jonathan Nieder is happy with it. I looked back in my logs and there was > > an extensive on-IRC discussion about it at the end of March, which ended > > in you sending: https://lore.kernel.org/git/YkICkpttOujOKeT3@nand.local/ > > > > But it seems Jonathan didn't chime in since then, and he had some major > > issues with the approach here. I think those should have been addressed > > by that discussion, but it would be nice to get a confirmation. > > I would still prefer if this used a repository format extension, but > that preference is not strong enough that I'd say "this must not go in > without one". What I think would help would be some information in > the user-facing documentation for commands that create and work with > cruft packs. In other words, if our take on people sharing > repositories between implementations that understand and don't > understand cruft packs and get objects moving back and forth between > packed and loose objects is "you should have known you were doing > something strange", the least we can do is to warn them. I think that's a good suggestion. We already have some documentation in Documentation/technical/cruft-packs.txt, but I think it could be helpful to add user-facing documentation, too. Would you be opposed to doing that outside of this series? ISTM that the technical discussion has mostly settled, so I'd rather wordsmith the user-facing documentation separately. > I don't see a config to enable PACK_CRUFT by default yet in this > series. I'd like one, so that people can turn it on and get the good > new behavior. :) `git gc` has support for this (c.f., "gc.cruftPacks"). `git repack` requires you to pass `--cruft`; IIRC I originally had a similar configuration in `git repack` which would change the behavior of `-A` / `-a` when set, but I found it too confusing and scrapped it. Thanks, Taylor