On Tue, Jan 28, 2020 at 04:56:26PM +0100, Han-Wen Nienhuys wrote:

> JGit currently implements what we have here, as this is what's spelled
> out in the spec that Shawn posted back in the day. It's probably
> acceptable to change this, though, as the reftable support has only
> landed in JGit very recently and will probably appear very
> experimental to folks.
>
> How would the layout be then? We'll have
>
>   HEAD - dummy file
>   reftable/ - the tables
>   refs/ - dummy dir
>
> where shall we store the reftable list? maybe in a file called
>
>   reftable-list
>
> If we have both HEAD/refs + (reftable/reftable-list), what should we
> put there to ensure that no git version actually manages to use the
> repository? (what happens if someone deletes the version setting from
> the .git/config file)

Yeah, it would be nice to have something that an older version of Git
would totally choke on, but I'm not sure we have a lot of leeway. What
we put in HEAD has to be syntactically legitimate enough to appease
validate_headref(), so our options are either
"ref: refs/something/bogus" or an object hash that we don't have (e.g.,
0{40}). The former would be preferable because it would (in theory)
prevent us from writing to HEAD, as well.

I wondered what would happen if you put in a syntactically invalid ref,
like "ref: refs/.not/.valid" (leading dots are not allowed in path
components of refnames). It does cause _some_ parts of Git to choke, but
sadly "git update-ref HEAD $sha1" actually writes to
.git/refs/.not/.valid. Even "refs/../../dangerous" doesn't give it
pause. Yikes. It seems we're pretty willing to accept symref
destinations without further checking.

Making "refs" a file instead of a directory does work nicely, as any
attempts to read or write would get ENOTDIR. And we can fool
is_git_directory() as long as it's marked executable.
That's OK on POSIX systems, but I'm not sure how it would work on
Windows (or maybe it would work just fine, since we presumably just say
"yep, everything is executable").

So perhaps that's enough, and what we put in HEAD won't matter (since
nobody will be able to write into refs/ anyway).

> > But that raises a question: how ready are reftables to handle non-sha1
> > object ids? I see a lot of GIT_SHA1_RAWSZ, and I think the on-disk
> > format actually has binary sha1s, right? In theory if those all become
> > the_hash_algo->rawsz, then it might "Just Work" to read and write
> > slightly larger entries.
>
> The format fixes the reftable at 20 bytes, and there is not enough
> framing information to just write more data. We'll have to encode the
> hash size in the version number somehow, e.g. we could use the higher
> order bit of the version byte to encode it.
>
> But it needs a new version of the spec. I think it's premature to do
> this while v1 of reftable isn't in git-core yet.

I don't know that we technically need the reftables file to say how long
the hashes are. The git config will tell us which hash we're using, and
everything else is supposed to follow. So I think it would work OK as
long as you're able to be told by the rest of Git that hashes are N
bytes, and just use that to compute the fixed-size records.

That said, it might make for easier debugging if the reftables file
declares the size it assumes.

-Peff
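
For anyone who wants to poke at the "refs as a file" trick discussed
above, here is a minimal sketch in Python rather than Git's own C. It
assumes a POSIX system. The helper name probe_refs_as_file(), the
placeholder file contents, and the use of os.access() as a stand-in for
an executable-bit check are all illustrative assumptions, not anything
taken from Git's source. It builds a throwaway directory whose "refs"
entry is an executable regular file, then tries to read and write a
loose ref beneath it.

```python
import errno
import os
import tempfile


def probe_refs_as_file():
    """Make "refs" an executable regular file and poke at paths below it.

    Returns (looks_executable, read_errno, write_errno).
    Hypothetical helper for illustration only; assumes POSIX semantics.
    """
    with tempfile.TemporaryDirectory() as gitdir:
        refs = os.path.join(gitdir, "refs")

        # "refs" is a regular file here, not a directory.
        # The contents are an arbitrary placeholder.
        with open(refs, "w") as f:
            f.write("placeholder: not a directory\n")
        os.chmod(refs, 0o755)

        # Stand-in (an assumption) for an executable-bit check on "refs".
        looks_executable = os.access(refs, os.X_OK)

        loose_ref = os.path.join(refs, "heads", "master")
        read_errno = None
        write_errno = None

        # Reading a loose ref below the file should fail.
        try:
            with open(loose_ref) as f:
                f.read()
        except OSError as e:
            read_errno = e.errno

        # Writing a loose ref below the file should fail as well.
        try:
            with open(loose_ref, "w") as f:
                f.write("0" * 40 + "\n")
        except OSError as e:
            write_errno = e.errno

        return looks_executable, read_errno, write_errno


if __name__ == "__main__":
    ok, r, w = probe_refs_as_file()
    print("executable:", ok)
    print("read errno: ", errno.errorcode.get(r))
    print("write errno:", errno.errorcode.get(w))
```

On a POSIX system both attempts should fail with ENOTDIR while the
executable check still passes, which is the property the message relies
on. Behavior on Windows may well differ, so treat this as a probe rather
than a guarantee.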