Re: [PATCH] advice: warn when sparse index expands

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 7/5/24 4:29 PM, Elijah Newren wrote:
On Wed, Jul 3, 2024 at 8:14 AM Derrick Stolee via GitGitGadget
<gitgitgadget@xxxxxxxxx> wrote:

From: Derrick Stolee <stolee@xxxxxxxxx>

Typically, forcing a sparse index to expand to a full index means that
Git could not determine the status of a file outside of the
sparse-checkout and needed to expand sparse trees into the full list of
sparse blobs. This operation can be very slow when the sparse-checkout
is much smaller than the full tree at HEAD.

Yep, I'm with you here.

When users are in this state, it is common that 'git status' will report
the problem.

I struggled to understand this sentence in combination with your later
statements, though that may only be because I had some difficulty with
later parts of the commit message.  Perhaps addressing the later parts
will make this sentence fine as-is, but it's possible this sentence
could do with a bit more detail.

I think also the issue with this sentence is that its thought isn't
complete until also reading the next one. These can be combined to
get to the point faster.

Usually there is a modified or untracked file outside of
the sparse-checkout mentioned by the 'git status' output. There are a
number of reasons why this is insufficient:

Fair enough; let's focus on why the output of git status is insufficient...

  1. Users may not have a full understanding of which files are inside or
     outside of their sparse-checkout. This is more common in monorepos
     that manage the sparse-checkout using custom tools that map build
     dependencies into sparse-checkout definitions.

Having sparse-checkout patterns managed by custom tools is a really
good point, but doesn't this statement of yours about needing to know
particular files or directories suggest that...

+static int give_advice_on_expansion = 1;
+#define ADVICE_MSG \
+       "The sparse index is expanding to a full index, a slow operation.\n" \
+       "This likely means that you have files in your working directory\n"  \
+       "that are outside of your sparse-checkout patterns. Remove them\n"   \
+       "to recover performance expectations, such as with 'git clean'."

...this is an insufficient solution?

I was a bit surprised you'd list your first reason for git status
being insufficient, that users need to know which files/directories
are the problem and then provide a solution that doesn't attempt to
identify any files or directories.

  2. In some cases, an empty directory could exist outside the
     sparse-checkout and these empty directories are not reported by 'git
     status' and friends.

This is a really good point too...but given this point, shouldn't your
added advice message also mention "directories" instead of just
mentioning "files" so that users are aware they need to look for those
too?

  3. If the user has '.gitignore' or 'exclude' files, then 'git status'
     will squelch the warnings and not demonstrate any problems.

Your solution does help the user to know that there is a problem (even
if they don't know which files -- or directories -- are the problem),
so this patch is making things better.

Thanks for pushing for a more helpful message. I'm currently thinking
that the core issue is that the working directory has contents (files
or directories) that disagree with the sparse-checkout definition. I
will update the language in v2 to point the user in the direction of
comparing the two.

Thanks,
-Stolee




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux