On Wed, Dec 06, 2023 at 03:10:48PM -0500, Jeff King wrote: > On Wed, Nov 29, 2023 at 11:13:18AM +0100, Patrick Steinhardt wrote: > > > As I'm currently working on the reftable backend this thought has also > > crossed my mind. The reftable backend doesn't only create "refs/", but > > it also creates "HEAD" with contents "ref: refs/heads/.invalid" so that > > Git commands recognize the Git directory properly. Longer-term I would > > really love to see us doing a better job of detecting Git repositories > > so that we don't have to carry this legacy baggage around. > > > > I can see different ways for how to do this: > > > > - Either we iterate through all known reference backends, asking > > each of them whether they recognize the directory as something > > they understand. > > > > - Or we start parsing the gitconfig of the repository so that we can > > learn about which reference backend to expect, and then ask that > > specific backend whether it thinks that the directory indeed looks > > like something it can handle. > > > > I'd personally prefer the latter, but I'm not sure whether we really > > want to try and parse any file that happens to be called "config". > > We do eventually parse the config file to pick up repositoryFormatVersion. > But there's sort of a chicken-and-egg here where we only do so after > gaining some confidence that it's a repo directory. :) > > I actually think the "ask each backend if it looks plausible" is > reasonable, at least for an implementation that knows about all > backends. Though what gives me pause is how older versions of Git will > behave with a new-format repository that does not have a "refs" > directory. > > There are really two compatibility checks. In is_git_directory(), we > want to say "is this a repo or not". And then later we parse the config, > make sure the repository format is OK, and that we support all > extensions. So right now, an older version of Git that encounters a > reftable-formatted repo (that has a vestigial "refs/" directory) says > "ah, that is a repo, but I don't understand it" (the latter because > presumably the repo version/extensions in .git/config are values it > doesn't know about). But if we get rid of "refs/", then older versions > of Git will stop even considering it as a repo at all, and will keep > searching up to the ceiling directory. So either: > > 1. They'll find nothing, and you'll get "you're not in a git repo", > rather than "you're in a git repo, but I don't understand it". > Which is slightly worse. > > 2. They'll find some _other_ containing repo. Which could be quite > confusing. > > So forgetting at all about how we structure the code, it seems to me > that the problem is not new code, but all of the existing code which > looks for access("refs", X_OK). True. The question is of course how much value there is in an old tool to be able to discover a new repository that it wouldn't be able to read in the first place due to it not understanding the reference format. So I'd very much like to see that eventually, we're able to get rid of "legacy" cruft that doesn't serve any purpose anymore. The question is whether we can do a better job of this going forward so that at least we don't have to pose the same question in the future. Right now, we'll face the same problem whenever any part of the current on-disk repository data structures changes. I wonder whether it would make sense to introduce something like a filesystem-level hint, e.g. in the form of a new ".is-git-repository" file. If Git discovers that file then it assumes the directory to be a Git repository -- and everything else is set up by parsing the config and thus the repository's configured format. > I dunno. Maybe that is being too paranoid about backwards compatibility. > People will have to turn on reftable manually, at least for a while, and > would hopefully know what they are signing up for, and that old versions > might not work as well. And by the time a new format becomes the > default, it's possible that those older versions would have become quite > rare. > > > Just throwing this out there, but we could use this as an excuse to > > introduce "extensions.refFormat". If it's explicitly configured to be > > "reffiles" then we accept repositories even if they don't have the > > "refs/" directory or a "packed-refs" file. This would still require work > > in alternative implementations of Git, but this work will need to happen > > anyway when the reftable backend lands. > > > > I'd personally love for this extension to be introduced before I'm > > sending the reftable backend upstream so that we can have discussions > > around it beforehand. > > We already have an extension config option to specify that we're using > reftable, don't we? But anything in config has the same chicken-and-egg > problems as above, I think. Not yet, no. I plan to submit the new "extensions.refFormat" extension soonish though, probably next week. Patrick
Attachment:
signature.asc
Description: PGP signature