Re: [PATCH 2/2] sparse-checkout: clear tracked sparse dirs

On Wed, Aug 4, 2021 at 7:55 PM Derrick Stolee <stolee@xxxxxxxxx> wrote:
>
> On 8/2/2021 12:17 PM, Elijah Newren wrote:
> > On Mon, Aug 2, 2021 at 8:34 AM Derrick Stolee <stolee@xxxxxxxxx> wrote:
> >>
> >> On 7/30/2021 9:52 AM, Elijah Newren wrote:
> >>> On Thu, Jul 29, 2021 at 11:27 AM Derrick Stolee via GitGitGadget
> >>> <gitgitgadget@xxxxxxxxx> wrote:
> > ...
> >>>> +                */
> >>>> +               if (S_ISSPARSEDIR(ce->ce_mode) &&
> >>>> +                   repo_file_exists(r, ce->name)) {
> >>>> +                       strbuf_setlen(&path, pathlen);
> >>>> +                       strbuf_addstr(&path, ce->name);
> >>>> +
> >>>> +                       /*
> >>>> +                        * Removal is "best effort". If something blocks
> >>>> +                        * the deletion, then continue with a warning.
> >>>> +                        */
> >>>> +                       if (remove_dir_recursively(&path, 0))
> >>>> +                               warning(_("failed to remove directory '%s'"), path.buf);
> >>>
> >>> Um, doesn't this delete untracked files that are not ignored as well
> >>> as the ignored files?  If so, was that intentional?  I'm fully on
> >>> board with removing the gitignore'd files, but I'm worried removing
> >>> other untracked files is dangerous.
> >>
> >> I believe that 'git sparse-checkout (set|add|reapply)' will fail before
> >> reaching this method if there are untracked files that could potentially
> >> be removed. I will double-check to ensure this is the case. It is
> >> definitely my intention to protect any untracked, non-ignored files in
> >> these directories by failing the sparse-checkout modification.
>
> This is _not_ true, and I can document it with a test.
>
> Having untracked files outside of the sparse cone is just as bad as
> ignored files, so I want to ensure that these get cleaned up, too.
>
> The correct thing would be to prevent the 'git sparse-checkout
> (set|add|reapply)' command from making any changes to the sparse-checkout
> cone or the worktree if there are untracked files that would be deleted.
> (Right? Or is there another solution that I'm missing here?)

We could sparsify as much as possible and print warnings, much like we
do with tracked files that are modified but not staged.  In fact, it
might feel inconsistent if we sparsify as much as possible for one
type of file, and abort if we cannot completely sparsify for a
different type of file.  We could consider changing how we treat
tracked files that are modified but not staged and have them abort the
sparse-checkout commands as well, but I worry that might cause
problems during conflict resolution in the middle of
merges/rebases/cherry-picks/reverts.  I don't want users caught where
they need to update their sparsity paths to gain new files/directories
that will help them resolve some conflicts, but be unable to update
their sparsity paths because they have conflicts.
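
To make the "sparsify as much as possible and print warnings" behavior
concrete, here is a rough standalone sketch.  It is not git code: the
enum, the struct, and sparsify_best_effort() are all made up for
illustration, and the worktree is modeled as a plain array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical model; none of these names exist in git. */
enum path_state {
	PATH_CLEAN,    /* tracked, matches the index */
	PATH_MODIFIED, /* tracked, modified but not staged */
};

struct path_entry {
	const char *name;
	enum path_state state;
	int present;   /* 1 while the file is still in the worktree */
};

/*
 * Best-effort sparsification: remove what is safe to remove, warn
 * about the rest, and never abort.  Returns the number of paths that
 * had to be left behind.
 */
static size_t sparsify_best_effort(struct path_entry *entries, size_t nr)
{
	size_t skipped = 0;

	assert(entries || !nr);
	for (size_t i = 0; i < nr; i++) {
		if (entries[i].state == PATH_MODIFIED) {
			fprintf(stderr,
				"warning: '%s' is modified; leaving it in place\n",
				entries[i].name);
			skipped++;
			continue;
		}
		entries[i].present = 0;
	}
	return skipped;
}
```

The point of this shape is that the command always finishes, so a
user in the middle of a conflicted merge can still change their
sparsity paths; the cost is that the worktree may end up only
partially sparsified.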

That said, the basic idea of aborting sparse-checkout in cone mode
when there are untracked unignored files in the way of removing
directories sounds reasonable, if there's some clever way to avoid or
ameliorate the inconsistency issues mentioned above.  Implementing it
might require walking all untracked (and tracked?) files twice,
though, because if there are untracked unignored files in the way, we
probably don't want to abort after first deleting lots of ignored
files.  (And there's a small race window in the double walk...)
However, I don't expect people to run sparse-checkout commands all
that often, so the double walk is probably a perfectly reasonable
performance cost.  I just wanted to note it.
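
The double walk could look roughly like the sketch below.  Again this
is a standalone model, not git code: file_kind, found_file,
find_blocker() and clear_dir_two_pass() are hypothetical names, and
the directory contents are an in-memory array rather than a real
filesystem walk.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a file found under a directory leaving the cone. */
enum file_kind {
	FILE_TRACKED,
	FILE_IGNORED,   /* untracked, matches a gitignore pattern */
	FILE_UNTRACKED, /* untracked and not ignored: must be preserved */
};

struct found_file {
	const char *name;
	enum file_kind kind;
	int present;    /* 1 while the file is still on disk */
};

/* Pass 1: look only.  Return the first untracked, unignored file, or NULL. */
static const struct found_file *find_blocker(const struct found_file *files,
					     size_t nr)
{
	for (size_t i = 0; i < nr; i++)
		if (files[i].present && files[i].kind == FILE_UNTRACKED)
			return &files[i];
	return NULL;
}

/*
 * Pass 2 runs only if pass 1 found nothing, so we never abort after
 * having already deleted some ignored files.  Returns 0 on success,
 * or -1 if the operation was refused (in which case nothing was touched).
 */
static int clear_dir_two_pass(struct found_file *files, size_t nr)
{
	assert(files || !nr);
	if (find_blocker(files, nr))
		return -1;
	for (size_t i = 0; i < nr; i++)
		files[i].present = 0;
	return 0;
}
```

The model does not capture the race: on a real filesystem, a file
created between the two passes would not be seen by the first one.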

> >>> My implementation of this concept (in an external tool) was more along
> >>> the lines of
> >>>
> >>>   * Get $LIST_OF_NON_SPARSE_DIRECTORIES by walking `git ls-files -t`
> >>> output and finding common fully-sparse directories
> >>>   * git clean -fX $LIST_OF_NON_SPARSE_DIRECTORIES
> >>
> >> I initially was running 'git clean -dfx -- <dir> ...' but that also
> >> requires parsing and expanding the index (or being very careful with
> >> the sparse index).
> >
> > `git clean -dfx -- <dir> ...` could also be very dangerous because
> > it'd delete untracked non-ignored files.  You want -X rather than -x.
> > One of those cases where capitalization is critical.
>
> Good point. I'd like to avoid using `git clean` as a subcommand, if
> possible, that way we have one fewer thing to do before integrating
> the `git sparse-checkout` builtin with the sparse index.

Oh, I didn't want to invoke a subcommand, I was just pointing out
where similar code might be found in case we wanted to call the same
functions from elsewhere (or maybe even turn some of it into library
functions we could call).  But that might be a moot point if we end up
making sparse-checkout fail if there are untracked unignored files
hanging around in the relevant directories.


