Re: On ref locking

Junio C Hamano <junkio@xxxxxxx> · Mon, 25 Sep 2006 21:11:27 -0700

Daniel Barkalow <barkalow@xxxxxxxxxxxx> writes:

> On Mon, 25 Sep 2006, Junio C Hamano wrote:
>...
>> What the latter means is that an updater:
>> 
>>  (1) first learns the current value of the $ref, without
>>      locking;
>> 
>>  (2) decides based on the knowledge from (1) what the next value
>>      should be;
>> 
>>  (3) gets $ref.lock, makes sure $ref still is the value it
>>      learned in (1), updates it to the desired value and
>>      releases the lock.
>> 
>> The above 3-step sequence prevents updater-updater races with an
>> extremely short critical section.  We only need to hold the lock
>> while we do compare and swap.
>
> I remember having a certain amount of disagreement over whether it's 
> better to actually take the lock in (1), and hold it through (2), or only 
> verify that it didn't change in (3) after taking the lock for real. If 
> there's nothing substantial going on in (2), it doesn't matter. If users 
> are producing content (e.g., git tag), they may want to make sure that 
> nobody else is in the middle of writing something that would conflict.
>
> I think I'd advocated getting the lock early if you're going to want it, 
> and I don't remember what the convincing argument on the other side was 
> for the cases under consideration at the time, beyond it not mattering 
> significantly.
>
> Note that we make sure to remove the lock when aborting due to signals 
> (assuming we get a chance), so the size of the critical section doesn't 
> matter too much to the chance of it getting left locked inappropriately. 
> Of course, this is harder to do with the core code for a shell script than 
> for C code.

I do not recall that arguments on the list but what you say
makes sense.  Taking a lock early makes implementation more
cumbersome on the scripting side but reduces the annoyance
factor.  You will be annoyed if the tool tells you somebody else
did something to the ref you were going to update and your
intended operation is blocked after carefully editing your tag
message or commit log message.

However, I think the distributed nature of git makes it less of
an issue for normal repositories.

Shared distribution point repositories (where everybody pushes
to and fetches from) needs to protect between updaters lest that
we lose commits (fetching a rev older while a push is in
progress is not an error -- re-feching would give you the final
result), and that is what Johannes's recent patch does.  Pushing
into a repository with working tree does have a lot of issues
(e.g. push to update the current branch head while a commit is
in progress) but doing that as the workflow has problems for
other reasons as well (i.e. push to update the current branch
would make the index out of sync even when no commit is in
progress).  It is strongly discouraged to share $GIT_DIR/refs,
just like in any halfway sane development workflow people do not
share working tree files.

So "locking before letting the user spend effort so that we
would not later have to say 'sorry, we cannot let you do that
because somebody else updated the ref in the meanwhile'" is
technically not incorrect, and we should be able to make things
work that way, but I think it pretty much is a "why bother"
issue.  "git tag" checks if the new tag being created does not
exist to reduce the annoyance factor, and the check does not
even attempt to avoid a race.  If we revalidated that the tag is
still not there while storing the tag object name to the ref in
the final step, I think that is good enough, and I think having
"lock during operation" buys us much real-life benefit.

git-commit has the same attitude, and I think it is the right
balance.  It checks if the commit being created is an initial
commit, a single-parent commit or a merge first, prepares "-p"
arguments to the commit-tree that will happen much later, then
lets the user prepare the log message.  Theoretically somebody
else could update the current HEAD in the meantime and the
resulting commit can have ancestors that are different from what
was intended.  We do check and avoid it by making sure that the
HEAD is still where we thought it would be before doing the
final.

-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html