Re: [PATCH] worktree: refactor lock_reason_valid and lock_reason to be more sensible

Junio C Hamano <gitster@xxxxxxxxx> · Mon, 29 Oct 2018 12:52:15 +0900

Nickolai Belakovski <nbelakovski@xxxxxxxxx> writes:

> This is an improvement because it fixes an issue in which the fields
> lock_reason and lock_reason_valid of the worktree struct were not
> being populated.

If the field "reason" should always be populated, there is *no*
reason why we need the "valid" boolean.  They work as a pair to
realize lazy population of rarely used field.  The lazy evaluation
technique is used as an optimization for common case, where majority
of operations do not care if worktrees are locked and if so why they
are locked, so that only rare operations that do want to find out
can ask "is this locked and why?" via is_worktree_locked() interface,
and at that point we lazily find it out by reading "locked" file.

So it is by design that these fields are not always populated, but
are populated on demand as book-keeping info internal to the API's
implementation.  It is not "an issue", and changing it is not a
"fix".

In addition, if we have already checked, then we do not even do the
same check again.  If in an earlier call we found out that a worktree
is not locked, we flip the _valid bit to true while setting _reason
to NULL, so that the next call can say "oh, that's not locked and we
can tell that without looking at the filesystem again" [*1*].

You are forcing the callers of get_worktrees() to pay the cost to
check, open and read the "why is this worktree locked?" file for all
worktrees, whether they care if these worktrees are locked or why
they are locked.  Such a change can be an improvement *ONLY* if you
can demonstrate that in the current code most codepaths that call
get_worktrees() end up calling is_worktree_locked() on all worktrees
anyways.  If that were the case, not having to lazily evaluate the
"locked"-ness, but always check upfront, would have a simplification
value, as either approach would be spending the same cost to open
and read these "locked" files.

But I do not think it is the case.  Outside builtin/worktree.c (and
you need to admit "git worktree" is a rather rare command in the
first place, so you shouldn't be optimizing for that if it hurts
other codepaths), builtin/branch.c wants to go to all worktrees and
update their HEAD when a branch is renamed (if the old HEAD is
pointing at the original name, of course), but that code won't care
if the worktree is locked at all.  I do not think of any caller of
get_worktrees() that want to know if it is locked and why for each
and every one of them, and I'd be surprised if that *is* the
majority, but as a proposer to burden get_worktrees() with this
extra cost, you certainly would have audited the callers and made
sure it is worth making them pay the extra cost?

If we are going to change anything around this area, I'd not be
surprised that the right move is to go in the opposite direction.
Right now, you cannot just get "is it locked?" boolean answer (which
can be obtained by a simple stat(2) call) without getting "why is it
locked?" (which takes open(2) & read(2) & close(2)), and if you are
planning a new application that wants to ask "is it locked?" a lot
without having to know the reason, you may want to make the lazy
evaluation even lazier by splitting _valid field into two (i.e. a
"do we know if this is locked?" valid bit covers "is_locked" bit,
and another "do we know why this is locked?" valid bit accompanies
"locked_reason" string).  And the callers would ask two separate
questions: is_worktree_locked() that says true or false, and then
why_worktree_locked() that yields NULL or string (i.e. essentially
that is what we have as is_worktree_locked() today).  Of course,
such a change must also be justified with a code audit to
demonstrate that only minority case of the callers of is-locked?
wants to know why

[Footnote]

*1* The codepaths that want to know if a worktree is locked or not
(and wants to learn the reason) are so rare and concentrated in
builtin/worktree.c, and more importantly, they do not need to ask
the same question twice, so we can stop caching and make
is_worktree_locked() always go to the filesystem, I think, and that
may be a valid change _if_ we allow worktrees to be randomly locked
and unlocked while we are looking at them, but if we want to worry
about such concurrent and competing uses, we need a big
repository-wide lock anyway, and it is the least of our problems
that the current caching may go stale without getting invalidated.
The code will be racing against such concurrent processes even if
you made it to go to the filesystem all the time.