Re: [PATCH 8/8] reflog_expire(): lock symbolic refs themselves, not their referent

Michael Haggerty <mhagger@xxxxxxxxxxxx> · Fri, 13 Feb 2015 17:26:04 +0100

On 02/12/2015 07:04 PM, Stefan Beller wrote:
> On Thu, Feb 12, 2015 at 8:52 AM, Michael Haggerty <mhagger@xxxxxxxxxxxx> wrote:
>> However, another important question is whether other Git implementations
>> have copied this unusual locking policy. If so, that would be a reason
>> not to change it. I just pinged the libgit2 maintainer to find out their
>> policy. Maybe you could find out about JGit?
> 
> I could not really find reflog expiration for jgit for a while, but
> then I stumbled upon this:
> 
>     // TODO: implement reflog_expire(pm, repo);
> 
> so I think it's safe to change it in git.git for now.

Stefan: This locking policy doesn't only affect "reflog expire"; we also
need to lock the reflog when we add a new entry to it, for example when
updating refs/heads/master through HEAD. Do you know what JGit locks in
that scenario?

I had a discussion in IRC [1] with Carlos Martín Nieto about how libgit2
handles this:

* When libgit2 updates a reference through a symref, it locks the
pointed-to reference while updating the reflog for the symbolic ref. So,
for example, "git commit" would hold refs/heads/master.lock while
updating both logs/refs/heads/master and logs/HEAD. (This matches the
current Git behavior.)
* libgit2 doesn't support "reflog expire", though it does have an API
that allows users to overwrite the reflog with a specified list of
entries. That API locks the reference itself; i.e., HEAD.lock. This
disagrees with Git's current behavior.

I also realized that Git's current policy is probably not tenable if one
process is re-seating a symref at the same time as another process is
expiring its reflog. The "git reflog expire HEAD" might grab
"refs/heads/master.lock" then start rewriting "logs/HEAD". Meanwhile,
"git checkout foo" would grab "HEAD.lock" (not being blocked by the
"expire" process), then rewrite it to "ref: refs/heads/foo", then grab
"refs/heads/foo.lock" before updating "logs/HEAD". So both processes
could be writing "logs/HEAD" at the same time.

In fact, it's even worse. "git checkout foo" and "git symbolic-ref HEAD
refs/heads/foo" release "HEAD.lock" *before* updating logs/HEAD--in
other words, they append to logs/HEAD without holding *any* lock.

What is the best way forward?

I think the current locking policy is untenable, even aside from the bug
in "symbolic-ref".

Switching to holding only "HEAD.lock" while updating "logs/HEAD" is the
right solution, but it would introduce an incompatibility with old
versions of Git and libgit2 (and maybe JGit?) Probably nobody would
notice, but who knows?

Therefore, I propose that we hold *both* "HEAD.lock" *and*
"refs/heads/master.lock" when modifying "logs/HEAD" (even when running
"reflog expire" or "reflog delete"). I think this will be compatible
with old versions of Git and libgit2, and it will also fix some design
problems mentioned above. Plus, it will prevent updates to the two
reflogs from appearing out of order relative to each other.

Someday, when old clients have been retired, we can optionally switch to
locking *only* "HEAD.lock" when modifying "logs/HEAD".

What do people think?
Michael

[1] https://github.com/libgit2/libgit2/issues/2902

-- 
Michael Haggerty
mhagger@xxxxxxxxxxxx

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html