Re: [PATCH 0/2] describe and diff: implement --no-optional-locks

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Mar 06, 2025 at 08:11:09AM -0800, Junio C Hamano wrote:

> "Benjamin Woodruff via GitGitGadget" <gitgitgadget@xxxxxxxxx>
> writes:
> 
> > git describe and git diff may update the index in the background for similar
> > performance reasons to git-status.
> 
> That is a wrong reasoning that is completely opposite, though.
> 
> The commands at the Porcelain level, like "status" and "diff",
> refresh the index for the CORRECTNESS purposes.

Right, but "status" supports --no-optional-locks already.

So I think these patches are wrong, but the goal is reasonable. What
git-status does with --no-optional-locks is to update the index
internally for its _own_ use (giving it the correct results), but not to
lock nor write out the resulting index (to avoid conflicting with other
running programs). So it's pessimal (losing the opportunity to share
what it learned) but prevents lock contention.

And I think other programs could do that. I even wrote back in
27344d6a6c (git: add --no-optional-locks option, 2017-09-27):

  I've punted here on finding more callers to convert, since "status" is
  the obvious one to call as a repeated background job. But "git diff"'s
  opportunistic refresh of the index may be a good candidate.

I must admit that I didn't imagine "describe" is something that somebody
would run a lot in the background, but there's not really any harm in
having it support optional locks if somebody cares about that case.

> The commands at the plumbing level, which are designed to be used in
> your own scripts, like "diff-files" and "diff-index", do not refresh
> the index for the performance purposes.  If your own script wants to
> produce correct result, your script IS responsible for refreshing
> the index after its last modification to working tree files before
> it starts to use the plumbing commands to inspect which ones are
> modified and which ones are not.  This is so that your script has
> more control over when the index is refreshed.  It does not have to
> pay cost to run refresh for each Git command it invokes, if it knows
> that it does not make any modification between the two invocations;
> it can instead refresh just once and then run these two plumbing
> commands.

Assuming --no-optional-locks is being used as intended, the idea is that
the script cannot touch the index at all, because it is running in the
background and does not want lock contention with things the user is
doing in the foreground. So it cannot naively do a single index refresh
followed by plumbing commands like diff-files. Either it must:

  - accept that diff-files might return stat-dirty results (yuck)

  - use its own index that is separate from the regular .git/index file.
    But that may be overly slow, since the index "update" would rewrite
    the whole thing from scratch. Of course all of our index writes are
    from scratch, but you'd pay the price even when there is nothing to
    update.

  - use a command which operates all in a single process, with an
    in-memory index that is updated but not written out (e.g., "git
    --no-optional-locks status").

-Peff




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux