Re: git add -p is slow for sparse checkout

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Dec 6, 2024 at 9:53 AM Manoraj K <mkenchugonde@xxxxxxxxxxxxx> wrote:
>
> ++ CC
>
> On Fri, Dec 6, 2024 at 9:16 AM Manoraj K <mkenchugonde@xxxxxxxxxxxxx> wrote:
> >
> > Hi team,
> >
> > I am currently on a sparse checkout repository, and it looks like the
> > `git add—p` command is much slower than in a full checkout repository.
> >
> > When run, the `git add—p` command expands to the full index and takes
> > more time to start showing the patch.
> >
> > Also, we have git trace2 enabled, and the timings of the subcommands
> > gave us an indication that the `git apply—-cached` subcommand is
> > taking way too long, with an average P50 of 3000ms against 300ms in
> > the full checkout repository. The other notable spikes are in `git
> > diff-files --color` (180ms vs 80ms) and `git diff-files --no-color`
> > (180ms vs 80ms).
> >
> > I am really interested in understanding why `git add -p` expands to a
> > full index, which I believe is the issue with start-up regression.
> > Also, would be great to understand the difference in performance of
> > the subcommands.

I started tracing execution of `git add -p`, it seems that `git add
-p` seems to require a
full index whereas a regular `git add` does not, because `git add -p`
ends up calling
 `do_read_index` in `read_cache.c`.


`do_read_index` seems to then call `prepare_repo_settings` that resets
the value of
`istate->repo.settings.command_requires_full_index` to `1`, even though
`istate->repo.settings.command_requires_full_index` passed to
`do_read_index` was
initially `0` for `git add -p`.

```
int do_read_index(struct index_state *istate, const char *path, int must_exist)
{
....
/*
* If the command explicitly requires a full index, force it
* to be full. Otherwise, correct the sparsity based on repository
* settings and other properties of the index (if necessary).
*/
prepare_repo_settings(istate->repo);
```

The comment here seems confusing to me — it seems like the intent here
was to preserve
sparseness when as possible, but `prepare_repo_settings` seemsto do a
blanket reset —

```
void prepare_repo_settings(struct repository *r)
{
....
/*
* This setting guards all index reads to require a full index
* over a sparse index. After suitable guards are placed in the
* codebase around uses of the index, this setting will be
* removed.
*/
r->settings.command_requires_full_index = 1;
```

Retaining the original setting seems to fix the problem and makes `git
add -p` fast,
however I don't know if there's a reason why it wasn't already done this way.

```
read-cache.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/read-cache.c b/read-cache.c
index 01d0b3ad22..92f7561a88 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2339,7 +2339,12 @@ int do_read_index(struct index_state *istate,
const char *path, int must_exist)
* to be full. Otherwise, correct the sparsity based on repository
* settings and other properties of the index (if necessary).
*/
+ int previous_full_index_setting =
+ istate->repo->settings.command_requires_full_index;
+
prepare_repo_settings(istate->repo);
+ istate->repo->settings.command_requires_full_index =
previous_full_index_setting;
+
if (istate->repo->settings.command_requires_full_index)
ensure_full_index(istate);
else
```





[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux