Re: [PATCH v2 38/43] refs: make some files backend functions public

Michael Haggerty <mhagger@xxxxxxxxxxxx> · Wed, 07 Oct 2015 18:00:20 +0200

On 10/07/2015 03:25 AM, David Turner wrote:
> On Mon, 2015-10-05 at 11:03 +0200, Michael Haggerty wrote:
>> On 09/29/2015 12:02 AM, David Turner wrote:
>>> Because HEAD and stash are per-worktree, other backends need to
>>> go through the files backend to manage these refs and their reflogs.
>>>
>>> To enable this, we make some files backend functions public.
>>
>> I have a bad feeling about this change.
>>
>> Naively I would expect a reference backend that cannot handle its own
>> (e.g.) stash to instantiate internally a files backend object and to
>> delegate stash-related calls to that object. That way neither class's
>> interface has to be changed.
>>
>> Here you are adding a separate interface to the files backend. That
>> seems like a more complicated and less flexible design. But I'm open to
>> being persuaded otherwise...
> 
> After some thought, here's a summary of the problem:
> 
> Some writes are cross-backend writes.  For example, if HEAD is a symref to
> refs/heads/master, a commit is a cross-backend write (HEAD itself is not
> updated, but its reflog is).  Ronnie's design of the ref backend
> structure did not account for cross-backend writes, because we didn't
> have per-worktree refs at the time (there was only HEAD, and there was
> only one copy of it).
> 
> Cross-backend writes are complicated because there is no way to tell a
> backend to do only part of a ref update -- for instance, to tell the
> files backend to update HEAD and HEAD's reflog but not
> refs/heads/master.  Maybe we could set a flag that would do this, but
> the synchronization would be fairly complicated.  For instance, an
> update to HEAD might need to confirm the old sha for HEAD, meaning that
> we couldn't do the db write first.  But if we move the db write second,
> then when the db code goes to do its check of the HEAD sha, it might see
> a new value.  Perhaps there's a way to make it work, but it seems
> fragile/complex.
> 
> Right now, for cross-backend reads/writes, the lmdb code cheats. It
> simply does the write directly and immediately.  This means that these
> portions of transactions cannot be rolled back.  That's clearly bad. 

That's a really good point.

I hate to break it to you, but the handling of symrefs in Git is already
a mess. HEAD is the only symref that I would really trust to work
correctly all the time. So I think that changes needn't be judged on
whether they handle symrefs perfectly. They should just not break them
in any dramatic new ways.

So, you pointed out the problem that HEAD (a per-worktree reference) can
be a symref that points at a shared reference. In fact, I think when
HEAD is symbolic it is only allowed to point at a branch under
refs/heads, so this particular problem is pretty well-constrained.

Are there other cases of cross-backend writes? I suppose there could be
a symref elsewhere among the per-worktree references that points at a
shared reference. But I can't think of any cases where this is done by
standard Git. Not that it is forbidden; I just don't think it is done by
any of the standard tools.

Or there could be a symref among the shared references that points at a
per-worktree reference. But AFAIK the only other symrefs that are in
common use are the refs/remotes/*/HEAD symrefs, and they always point at
references within the same (shared) namespace.

If everything that I've said is correct, then my opinion is that it
would be perfectly adequate if your code would handle the specific case
of HEAD (by hook or by crook), and if there are any other cross-backend
symrefs, just die with a message stating that such usage is unsupported.
Junio, do you think that would be acceptable?
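
To make the proposal concrete, here is a rough sketch of the check I have
in mind. It is not taken from the patch series: the function and enum names
are made up, and the rule "anything outside refs/ is per-worktree" is only
a simplifying assumption for illustration. A caller would die() with an
"unsupported" message on SYMREF_UNSUPPORTED.

```c
#include <assert.h>
#include <string.h>

/*
 * ASSUMPTION (illustration only): a ref is per-worktree iff its name
 * does not start with "refs/". The real classification is up to the
 * backend code and may differ.
 */
int is_per_worktree_ref(const char *refname)
{
	return strncmp(refname, "refs/", 5) != 0;
}

/* Hypothetical verdicts for a symref and the ref it points at. */
enum symref_verdict {
	SYMREF_SAME_BACKEND,   /* both refs live in the same backend */
	SYMREF_HEAD_TO_BRANCH, /* the one cross-backend case we handle */
	SYMREF_UNSUPPORTED     /* any other cross-backend symref */
};

enum symref_verdict classify_symref(const char *refname, const char *target)
{
	assert(refname && target);

	if (is_per_worktree_ref(refname) == is_per_worktree_ref(target))
		return SYMREF_SAME_BACKEND;

	/* HEAD pointing at a branch under refs/heads/ is the special case. */
	if (!strcmp(refname, "HEAD") && !strncmp(target, "refs/heads/", 11))
		return SYMREF_HEAD_TO_BRANCH;

	return SYMREF_UNSUPPORTED;
}
```

With this, HEAD -> refs/heads/master is handled specially, the
refs/remotes/*/HEAD symrefs stay within one backend, and a shared ref
pointing at HEAD is rejected.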

> The simplest solution would be for the lmdb code to simply acquire
> locks, and write to lock files, and then commit those lock files just
> before the db transaction commits. Then the lmdb code would handle all
> of the orchestration without the files backend having to be rewritten to
> handle this case.

Wouldn't that essentially be re-implementing the files backend? I must
be missing something.

> [...]

BTW I just realized that if one backend should delegate to another, then
the primary backend should be the per-worktree backend and it should
delegate to the common backend. I think I described things the other way
around in my earlier message. This makes more sense because it is
acceptable for per-worktree references to refer to common references but
not vice versa.
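
As a rough sketch of that direction of delegation (all names here are
hypothetical, this is not the refs API, and the "outside refs/ means
per-worktree" test is again just an assumption for illustration): lookups
start at the per-worktree backend, which falls back to the common backend
for names it does not own. The common backend has no fallback, so it can
never reach a per-worktree ref.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, stripped-down backend descriptor. */
struct ref_backend {
	const char *name;
	int (*handles)(const char *refname);  /* does this backend own the ref? */
	const struct ref_backend *fallback;   /* backend to delegate to, or NULL */
};

/* ASSUMPTION: per-worktree refs are those outside "refs/". */
int handles_per_worktree(const char *refname)
{
	return strncmp(refname, "refs/", 5) != 0;
}

int handles_shared(const char *refname)
{
	return !handles_per_worktree(refname);
}

/* The common (shared) backend, e.g. lmdb; it delegates to nobody. */
const struct ref_backend common_backend = {
	"lmdb", handles_shared, NULL
};

/* The primary backend: per-worktree refs in files, everything else delegated. */
const struct ref_backend worktree_backend = {
	"files", handles_per_worktree, &common_backend
};

/* Walk the delegation chain until some backend claims the ref. */
const struct ref_backend *backend_for(const struct ref_backend *primary,
				      const char *refname)
{
	const struct ref_backend *b;

	assert(refname != NULL);
	for (b = primary; b; b = b->fallback)
		if (b->handles(refname))
			return b;
	return NULL;
}
```

Starting from worktree_backend, "HEAD" resolves to the files backend and
"refs/heads/master" to the common one. Starting from common_backend,
"HEAD" resolves to nothing, which matches the "not vice versa" rule.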

Michael

-- 
Michael Haggerty
mhagger@xxxxxxxxxxxx
