On Mon, Mar 10, 2025 at 01:50:36PM -0700, Benjamin Woodruff wrote: > It might help if I start with some more context of why I wrote this > patch. We've got a tool that uses the Rust `vergen-gitcl` crate to call > `git describe --dirty`. You can see that code here: > https://github.com/vercel/next.js/pull/76889/files > > We use `vergen-gitcl` to generate version identifiers for an on-disk > cache. This cache stores results of thousands of functions and has no > backwards compatibility. We want to invalidate it when *any* of the code > changes. `git-describe` felt like a good fit for that, as it gives us a > unique identifier that's still reasonably user-friendly. > > However, we discovered that we'd frequently end up with stale git > lockfiles. This appeared to be due some combination of IDE tools that > run the build in the background (i.e. the rust-analyzer LSP), behavior > that causes builds to sometimes get killed before completion, and the > fact that `git describe --dirty` takes a lock. Yeah, that is not quite the original use case that --no-optional-locks was designed for (i.e., simultaneous contention), but I think it is a reasonable application of the flag. > >>> git describe and git diff may update the index in the background for > >>> similar performance reasons to git-status. > >> > >> That is a wrong reasoning that is completely opposite, though. > >> > >> The commands at the Porcelain level, like "status" and "diff", > >> refresh the index for the CORRECTNESS purposes. > > > > Right, but "status" supports --no-optional-locks already. > > Does this mean the documentation in `git-status` is incorrect? It > implies that the background refresh is only for performance reasons. > That's where I got this idea from: > <https://git-scm.com/docs/git-status#_background_refresh> I think Junio gave an explanation here, so I won't repeat that. But I also think both of us may have been a bit confused about the changes your patches are making, because there's some subtlety. The important thing to keep in mind is that there are _two_ steps: refreshing the in-core index and writing the result out to the on-disk file. With --no-optional-locks we must continue to do the first step (for correctness), and skip the second step. So looking at your patch 1/2 for git-describe, it is doing the right thing: we still call refresh_index() always, and only skip the calls to repo_hold_locked_index() and repo_update_index_if_able(). But one thing that puzzles me is that we read and refresh the index first and only _then_ take a lock. Which seems wrong to me, as we could racily overwrite an intermediate write from somebody else that we never even saw (e.g., imagine you call "git add" at just the wrong moment). That is not a bug in your code, but an existing problem that I think made it harder to understand your change (and probably one we should fix regardless). Your patch 2/2 for git-diff is what I thought was actually wrong, but after digging further, I'm not so sure. In your patch we return early from refresh_index_quietly(), without actually refreshing the in-core index. So I _thought_ that meant we'd produce a wrong answer for something like this: $ touch git.c $ ./git --no-optional-locks diff where we should report "no changes", but would instead find the stat-dirty git.c (just like a plumbing "git diff-files" would). But that doesn't happen! That's because refresh_index_quietly() runs after the diff has completed anyway. The real magic is in diffcore_skip_stat_unmatch(), which processes individual stat-dirty entries and suppresses them (when there's no actual content change). So the call in refresh_index_quietly() really is just about updating what we're about to write out, and your patch is correct to bail from the whole function (if we are not writing it out, there is no purpose in refreshing at that point). So as far as I can tell the patches are doing the right thing. But I think the commit messages probably need to describe those subtleties and argue that the change is correct. Bonus points if a preparatory patch fixes the race in git-describe. ;) > It's also worth noting that libgit2 does not do this background refresh > by default (`GIT_DIFF_UPDATE_INDEX` and `GIT_STATUS_OPT_UPDATE_INDEX`). > I think that makes sense for libgit2's typical use-cases, but it is a > divergence in behavior. Yes, I think that is probably a reasonable default for libgit2, where you'd expect everything to happen in a single process (that can share the in-core index). -Peff