On Fri, May 03, 2024 at 01:55:53PM -0400, Jeff King wrote:
> On Tue, Apr 30, 2024 at 01:25:32PM +0200, Patrick Steinhardt wrote:
>
> > > So this is where I will show my ignorance of reftables. I assume it
> > > still has to implement FETCH_HEAD as a file (since it holds extra data).
> > > But does it do the same for other names outside of "refs/"? I am
> > > assuming not in the paragraph below.
> >
> > No, that's why we originally introduced the "special refs" syntax, as
> > defined in gitglossary(7). There are only two files that behave like
> > refs, but circumvent the ref backend: FETCH_HEAD and MERGE_HEAD. Both of
> > these have special syntax and carry additional metadata, and as such
> > they cannot be stored generically in a ref backend.
> >
> > All other root refs are stored via the ref backend.
>
> OK, that matches what I guessed based on the existence of special refs. ;)
> Thanks for confirming.
>
> Part of me does wonder if things would be simpler if ref backends only
> handled refs/*, and pseudo/special/root refs remained as their own thing
> in the filesystem. They're a limited set, so we don't really care about
> scaling in the same way. And their point is to be somewhat ephemeral, so
> even if you wanted to be clever with a replicated database-backed refs
> store, you probably don't care if CHERRY_PICK_HEAD goes away.

I think this would have several downsides:

  - You cannot perform atomic updates and reads of the whole
    repository's ref state. Overall, the whole ref namespace is fully
    contained by the ref database.

  - Not having those loose refs can improve security because you do not
    have to parse arbitrary paths in the Git repository, and those will
    not contain arbitrary information or even be symbolic links in case
    `core.preferSymlinkRefs` is set.

  - Every file that is not a ref needs special treatment for garbage
    collection.

  - There is a weird mismatch where some refs can be surfaced via
    tooling whereas others can't really.
    You either cannot use normal plumbing commands to access those
    refs, or you must create hacks in the ref layer. Any of those hacks
    is only going to be a partial solution, and the cases in which
    reading those files as refs doesn't work stick out like a sore
    thumb.

  - Conceptually, on the UX side, it's totally weird that some refs are
    more special than others. This is quite hard to explain to our
    users.

I see it as a benefit that we're now finally cleaning up this mess and
making things a lot more straightforward.

Now I don't fully disagree with what you're saying: I wish that a lot of
the state was more self-contained to the particular subsystem. The
git-bisect(1) state is a prime example, where we clutter the gitdir with
various different files. But the end goal in my opinion should be that
something is either a proper ref, in which case it is stored in the ref
backend, or it is not and cannot ever be accessed as one. The current
in-between state is just plain weird.

> And it's not clear to me what the path forward is for scripts which poke
> at .git/* to determine repo state. For example, I think git-prompt.sh
> looks at CHERRY_PICK_HEAD and REVERT_HEAD to decide what we're doing.

They shouldn't, in my opinion. It's one of the consequences of accepting
multiple ref backends into Git: tooling must not assume the on-disk file
format, and it should use Git plumbing commands to access the data
instead. I have already updated git-prompt.sh to do so.

> Maybe we just roll all of that into a command which returns all details
> of the repo state?

That indeed is something I have been thinking about quite a lot recently
and that I would certainly love to see. Making the state as discussed
here more visible would be nice.

It would also allow us to fix the weirdness that git-rev-parse(1) has
become. Its scope has gone way beyond parsing revs due to all those
weird modes where it exercises the repository's state.
Those are needed, sure, and we didn't have a better place to put them in
the past. But ideally, things like `--local-env-vars` or
`--resolve-git-dir` have no reason to exist in git-rev-parse(1) at all.
So if we had a new plumbing command that allows us to query a repository
for repository-level information, it would just be natural to move those
over.

We do have this in our backlog at GitLab, but haven't gotten to it yet.

Patrick
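
To illustrate the point above about scripts using plumbing rather than
poking at .git/*, here is a minimal sketch. It is not the actual
git-prompt.sh change: the helper name `repo_operation` and its output
strings are made up for this example. It only assumes that
CHERRY_PICK_HEAD and REVERT_HEAD can be resolved like any other ref with
`git rev-parse --verify --quiet`.

```shell
# Hypothetical helper (not taken from git-prompt.sh): report an
# in-progress cherry-pick or revert by resolving the ref through Git,
# instead of testing for a file under .git/.
repo_operation () {
	# $1: path to the repository (defaults to the current directory)
	if git -C "${1:-.}" rev-parse --verify --quiet CHERRY_PICK_HEAD >/dev/null
	then
		echo cherry-pick
	elif git -C "${1:-.}" rev-parse --verify --quiet REVERT_HEAD >/dev/null
	then
		echo revert
	else
		echo none
	fi
}
```

Because the lookup goes through the ref backend, the same check should
work regardless of whether the repository uses the "files" or the
"reftable" format.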