Re: [PATCH v3 5/7] refresh_index(): add REFRESH_DONT_MARK_SPARSE_MATCHES flag

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Mar 19, 2021 at 1:05 PM Junio C Hamano <gitster@xxxxxxxxx> wrote:
>
> Matheus Tavares Bernardino <matheus.bernardino@xxxxxx> writes:
>
> >> In other words, the change makes me wonder why we are not adding a
> >> flag that says "do we or do we not want to match paths outside the
> >> sparse checkout cone?", with which the seen[] would automatically
> >> record the right thing.
> >
> > Yeah, makes sense. I didn't want to make the flag skip the sparse
> > paths unconditionally (i.e. without matching them) because then we
> > would also skip the ce_stage() checkings below and the later
> > ce_mark_uptodate(). And I wasn't sure whether this could cause any
> > unwanted side effects.
> >
> > But thinking more carefully about this now, unmerged paths should
> > never have the SKIP_WORKTREE bit set anyway, right? What about the
> > CE_UPTODATE mark, would it be safe to skip it? I'm not very familiar
> > with this code, but I'll try to investigate more later.

Sorry I haven't given any update on this yet. From what I could see so
far, it seems OK to ignore the skip_worktree entries in
refresh_index() when it is called from `git add --refresh`. But
because we would no longer mark the skip_worktree entries that match
the pathspec with CE_UPTODATE, do_write_index() would start checking
if they are racy clean (this is only done when `!ce_uptodate(ce)`),
which could add some lstat() overhead.

However, this made me think what happens today if we do have a racy
clean entry with the skip_worktree bit set... `git add --refresh` will
probably update the index without noticing that the entry is racy
clean (because it won't check CE_UPTODATE entries, and skip_worktree
entries will have this bit set in refresh_index()). Thus the entries'
size won't be truncated to zero when writing the index, and the entry
will appear unchanged even if we later unset the skip_worktree bit.

But can we have a "racy clean skip_worktree entry"? Yes, this seems
possible e.g. if the following sequence happens fast enough for mtime
to be the same before and after the update:

  echo x >file
  git update-index --refresh --skip-worktree file
  echo y>file

Here is a more complete example which artificially creates a "racy
clean skip_worktree entry", runs `git add --refresh`, and shows that
the racy clean entry was not detected:

# Setup
echo sparse >sparse
echo dense >dense
git add .
git commit -m files

# Emulate a racy clean situation
touch -d yesterday date
touch -r date sparse
git update-index --refresh --skip-worktree sparse
touch -r date .git/index
echo xparse >sparse
touch -r date sparse

# `git add` will now write a new index without checking if
# `sparse` is racy clean nor truncating its size
touch -r date dense
git add --refresh .

git update-index --no-skip-worktree sparse
git status
<doesn't show that `sparse` was modified>

This situation feels rather uncommon, but perhaps the same could
happen with `git sparse-checkout set` instead of `git update-index
--refresh --skip-worktree`? IDK. This made me think whether
refresh_index() should really mark skip_worktree entries with
CE_UPTODATE, in the first place.

Any thoughts?



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux