Re: [PATCH] advice: warn when sparse index expands

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

On Wed, Jul 3, 2024 at 8:14 AM Derrick Stolee via GitGitGadget
<gitgitgadget@xxxxxxxxx> wrote:
>
> From: Derrick Stolee <stolee@xxxxxxxxx>
>
> Typically, forcing a sparse index to expand to a full index means that
> Git could not determine the status of a file outside of the
> sparse-checkout and needed to expand sparse trees into the full list of
> sparse blobs. This operation can be very slow when the sparse-checkout
> is much smaller than the full tree at HEAD.

Yep, I'm with you here.

> When users are in this state, it is common that 'git status' will report
> the problem.

I struggled to understand this sentence in combination with your later
statements, though that may only be because I had some difficulty with
later parts of the commit message.  Perhaps addressing the later parts
will make this sentence fine as-is, but it's possible this sentence
could do with a bit more detail.

> Usually there is a modified or untracked file outside of
> the sparse-checkout mentioned by the 'git status' output. There are a
> number of reasons why this is insufficient:

Fair enough; let's focus on why the output of git status is insufficient...

>  1. Users may not have a full understanding of which files are inside or
>     outside of their sparse-checkout. This is more common in monorepos
>     that manage the sparse-checkout using custom tools that map build
>     dependencies into sparse-checkout definitions.

Having sparse-checkout patterns managed by custom tools is a really
good point, but doesn't this statement of yours about needing to know
particular files or directories suggest that...

> diff --git a/sparse-index.c b/sparse-index.c
> index e48e40cae71..1e517f696dd 100644
> --- a/sparse-index.c
> +++ b/sparse-index.c
> @@ -12,6 +12,21 @@
>  #include "config.h"
>  #include "dir.h"
>  #include "fsmonitor-ll.h"
> +#include "advice.h"
> +
> +/**
> + * This global is used by expand_index() to determine if we should give the
> + * advice for advice.sparseIndexExpanded when expanding a sparse index to a full
> + * one. However, this is sometimes done on purpose, such as in the sparse-checkout
> + * builtin, even when index.sparse=false. This may be disabled in
> + * convert_to_sparse().
> + */
> +static int give_advice_on_expansion = 1;
> +#define ADVICE_MSG \
> +       "The sparse index is expanding to a full index, a slow operation.\n" \
> +       "This likely means that you have files in your working directory\n"  \
> +       "that are outside of your sparse-checkout patterns. Remove them\n"   \
> +       "to recover performance expectations, such as with 'git clean'."

...this is an insufficient solution?

I was a bit surprised you'd list your first reason for git status
being insufficient, that users need to know which files/directories
are the problem and then provide a solution that doesn't attempt to
identify any files or directories.

>  2. In some cases, an empty directory could exist outside the
>     sparse-checkout and these empty directories are not reported by 'git
>     status' and friends.

This is a really good point too...but given this point, shouldn't your
added advice message also mention "directories" instead of just
mentioning "files" so that users are aware they need to look for those
too?

>  3. If the user has '.gitignore' or 'exclude' files, then 'git status'
>     will squelch the warnings and not demonstrate any problems.

Your solution does help the user to know that there is a problem (even
if they don't know which files -- or directories -- are the problem),
so this patch is making things better.

Two other small comments...

> diff --git a/advice.c b/advice.c
> index 558a46fc0b3..7845e427c89 100644
> --- a/advice.c
> +++ b/advice.c
> @@ -77,6 +77,7 @@ static struct {
>         [ADVICE_RM_HINTS]                               = { "rmHints" },
>         [ADVICE_SEQUENCER_IN_USE]                       = { "sequencerInUse" },
>         [ADVICE_SET_UPSTREAM_FAILURE]                   = { "setUpstreamFailure" },
> +       [ADVICE_SPARSE_INDEX_EXPANDED]                  = { "sparseIndexExpanded" },
>         [ADVICE_SKIPPED_CHERRY_PICKS]                   = { "skippedCherryPicks" },

The rest of the list is in alphabetical order, so the new entry
probably should be too.

>         [ADVICE_STATUS_AHEAD_BEHIND_WARNING]            = { "statusAheadBehindWarning" },
>         [ADVICE_STATUS_HINTS]                           = { "statusHints" },
> diff --git a/advice.h b/advice.h
> index 5105d90129d..572272fa0da 100644
> --- a/advice.h
> +++ b/advice.h
> @@ -44,6 +44,7 @@ enum advice_type {
>         ADVICE_RM_HINTS,
>         ADVICE_SEQUENCER_IN_USE,
>         ADVICE_SET_UPSTREAM_FAILURE,
> +       ADVICE_SPARSE_INDEX_EXPANDED,
>         ADVICE_SKIPPED_CHERRY_PICKS,

Again, the rest of the entries are in alphabetical order.


Anyway, I've nit-picked on the little things I saw, but overall this
is a good patch that moves things in a good direction; with a few
small touch-ups it'll probably be good to go.





[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux