Re: [PATCH 0/2] Ensure unique worktree ids across repositories

Caleb White <cdwhite3@xxxxx> · Fri, 29 Nov 2024 03:31:09 +0000

On Thu Nov 28, 2024 at 9:14 PM CST, Junio C Hamano wrote:
> Caleb White <cdwhite3@xxxxx> writes:
>
>> The `es/worktree-repair-copied` topic added support for repairing a
>> worktree from a copy scenario. I noted[1,2] that the topic added the
>> ability for a repository to "take over" a worktree from another
>> repository if the worktree_id matched a worktree inside the current
>> repository which can happen if two repositories use the same worktree name.
>
> Problem worth solving.  Another would be to fail if the worktree ID
> proposed to be used is already in use, but the ID is supposed to be
> almost invisible (unless the user is doing some adiministrative work
> on the repository), generating a unique ID is a good approach.

There's already a `while` loop that tries incrementing the proposed id
(e.g., `develop1` if `develop` is taken). However, this is on
a per-repository basis, so there's no way to know what's already in use
in another repository. The problem arises when trying to run `worktree
repair` on a worktree that has a matching id (like `develop`) from
another repository. The goal here is that the same worktree name can be
created in different repositories and the id will be (effectively)
unique.

>> This series teaches Git to create worktrees with a unique suffix so
>> that the worktree_id is unique across all repositories even if they have
>> the same name. For example creating a worktree `develop` would look like:
>>
>>     foo/
>>     ├── .git/worktrees/develop-5445874156/
>>     └── develop/
>>     bar/
>>     ├── .git/worktrees/develop-1549518426/
>>     └── develop/
>>
>> The actual worktree directory name is still `develop`, but the
>> worktree_id is unique and prevents the "take over" scenario. The suffix
>> is given by the `git_rand()` function, but I'm open to suggestions if
>> there's a better random or hashing function to use.
>
> I do not think it matters much what hash/rand algorithm is chosen.
> What is important is what you do when the suffix suggested by that
> chosen algorithm collides with an existing worktree ID.  IOW, there
> is no way a "random" can guarantee uniqueness.  Attempt to create and
> if you find a collision, retry from the generation of another suffix,
> or something like that, is necessary.
>
> And as long as that "make sure it is unique" part is done right, it
> does not even have to be random.  Just generating a sequence number
> and using the first one that is available would work as well.

The `while` loop mentioned earlier still exists, so in the (unlikely)
event that the suffix collides with an existing worktree_id, it will
increment the suffix and try again.

Best,

Caleb