Re: Confusing treatment of "head" in worktree on case-insensitive FS

Jeff King <peff@xxxxxxxx> · Mon, 1 Jul 2024 14:28:24 -0400

On Mon, Jul 01, 2024 at 02:17:21PM +0100, Phillip Wood wrote:

> On 01/07/2024 04:31, Jeff King wrote:
> > On Sat, Jun 29, 2024 at 10:39:29AM -0400, Julia Evans wrote:
> > > $ git init
> > > $ git commit --allow-empty -m'test'
> > > $ git worktree add /tmp/myworktree
> > > $ cd /tmp/myworktree
> > > $ git commit --allow-empty -m'test'
> > > $ git rev-parse head
> > > adf59ca8da0ee5c4eb455f87efecc6c79eaf030f
> > > $ git rev-parse hEAd
> > > adf59ca8da0ee5c4eb455f87efecc6c79eaf030f
> > > $ git rev-parse HEAD
> > > fbb28196d08d74aa61f65e5cee3cb11cc24c338a
> > 
> > I admit this is an unexpected case, as I'd expect both on-disk files to
> > be spelled "HEAD". I didn't dig into the details, though, so it's
> > possible there's something we could be doing differently or better. But
> > I do think it's mostly the tip of the iceberg for case-insensitivity
> > issues with refs.
> 
> I think what's happening is that the checks in is_current_worktree_ref() are
> case sensitive so "head" is not treated as local to the current worktree and
> ends up being resolved in the main worktree. I guess we could make those
> checks case-insensitive but as you say it'd only be dealing with the tip of the
> iceberg.

Ah, right, that makes perfect sense (well, why it happens that way, not
from the perspective of a user :) ).

So one thing we could do (but I am not sure is wise) is for those checks
to become case-insensitive for a case-insensitive ref store. And then at
least if you use consistent case when writing refs, you should get
reasonable behavior (whereas if you make "hEaD" and "HEAD" yourself, all
bets are off). But I'd worry about opening up even more weird corner
cases. And you can already avoid this problem (I think) by using the
case-sensitive spelling "HEAD" on lookups.
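
To make the mismatch concrete, here is a toy Python model of it. This is
only a sketch, not git's code: FoldingDir, looks_worktree_local() and
resolve() are made-up names, the check is reduced to the single name
"HEAD", and the ids are shortened from the transcript above. It assumes
an exact-match check sitting on top of a store that folds case the way
the filesystem does.

```python
class FoldingDir:
    """Toy directory on a case-insensitive, case-preserving filesystem."""

    def __init__(self):
        self._files = {}  # casefolded name -> (spelling first written, contents)

    def write(self, name, contents):
        key = name.casefold()
        # Keep the spelling of an existing entry, as a case-preserving FS would.
        stored = self._files.get(key, (name, None))[0]
        self._files[key] = (stored, contents)

    def read(self, name):
        # Lookups ignore case entirely.
        return self._files[name.casefold()][1]


def looks_worktree_local(refname, ignore_case=False):
    """Hypothetical stand-in for the per-worktree check (HEAD only)."""
    if ignore_case:
        return refname.casefold() == "head"
    return refname == "HEAD"


def resolve(refname, main_dir, worktree_dir, ignore_case=False):
    """Pick the directory from the check, then let the 'filesystem' fold case."""
    local = looks_worktree_local(refname, ignore_case)
    return (worktree_dir if local else main_dir).read(refname)


main_dir, worktree_dir = FoldingDir(), FoldingDir()
main_dir.write("HEAD", "adf59ca")      # main worktree's HEAD
worktree_dir.write("HEAD", "fbb2819")  # linked worktree's HEAD

head_lower = resolve("head", main_dir, worktree_dir)
head_upper = resolve("HEAD", main_dir, worktree_dir)
head_folded = resolve("head", main_dir, worktree_dir, ignore_case=True)
```

With the exact-match check, "head" misses the per-worktree test and is
read from the main directory, where the folding lookup happily finds
"HEAD". With ignore_case=True both spellings land in the worktree.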

> On a related note do macOS and Windows do any Unicode normalization when
> looking up filenames? If so then that's probably another can of worms for
> filesystem based refs.

At least macOS does. That's why we have all of the precompose-unicode
code, which tries to normalize arguments to match what the OS will do.
In theory we could do something like that for case normalizing, but I
don't think it's nearly as simple.
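
As a rough illustration of what normalizing a name means here (a sketch
using Python's unicodedata module, not git's C implementation; the
precompose() helper is a name made up for this example):

```python
import unicodedata

precomposed = "caf\u00e9"   # "é" as a single code point
decomposed = "cafe\u0301"   # "e" followed by a combining acute accent


def precompose(name):
    """Hypothetical helper: fold a name to its precomposed (NFC) form."""
    return unicodedata.normalize("NFC", name)


# The two spellings render identically but differ byte for byte.
differ_raw = precomposed != decomposed
# After normalization they compare equal.
equal_normalized = precompose(precomposed) == precompose(decomposed)
```

The two strings look the same on screen but are different byte
sequences, so a comparison only works once both sides have been folded
to the same form.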

For a read, normalizing "head" to "HEAD" on a case-insensitive
filesystem is OK, since the OS will return the same file for any
spelling that differs only in case.

But writing is harder. The Unicode normalization in the filesystem is
not "preserving". So if I pass in a precomposed string, the filesystem
(HFS+, at least) is going to silently convert it to the decomposed form
anyway. But case is
usually preserving. So if I write "hEaD", I'll get that in the
filesystem, and not actually "HEAD". I dunno. Maybe it would be OK if we
did that only for root refs which would otherwise be forbidden. But it
really feels like opening up a can of complexity worms and corner cases.
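
A toy sketch of the difference between the two behaviors, with plain
dicts standing in for directories. Everything here is hypothetical:
store_normalizing() and store_case_preserving() are invented for the
example, and the normalization form the filesystem folds to is left as a
parameter rather than asserted.

```python
import unicodedata


def store_normalizing(directory, name, contents, form="NFD"):
    """Non-preserving: the on-disk name is always the normalized spelling.

    `form` is an assumption of this sketch, not a statement about any
    particular filesystem.
    """
    directory[unicodedata.normalize(form, name)] = contents


def store_case_preserving(directory, name, contents):
    """Case-insensitive but preserving: the first spelling written sticks."""
    for existing in directory:
        if existing.casefold() == name.casefold():
            directory[existing] = contents  # same file, old spelling kept
            return
    directory[name] = contents


normalizing, preserving = {}, {}
store_normalizing(normalizing, "caf\u00e9", "x")
store_case_preserving(preserving, "hEaD", "x")
store_case_preserving(preserving, "HEAD", "y")

names_normalizing = list(normalizing)
names_preserving = list(preserving)
```

In the normalizing directory the caller's spelling is gone after the
write, so normalizing up front loses nothing. In the case-preserving one
"hEaD" stays on disk even after a later write as "HEAD", which is the
part that makes normalizing on write awkward.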

-Peff