Re: git stash takes excessively long when many untracked files present

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Junio C Hamano <gitster@xxxxxxxxx> writes:

> In any case, this is a regression introduced in 'master' since the
> last release, and the attempted fix was for an issue that has long
> been with us, so I'll revert a7365313 (git stash: avoid data loss
> when "git stash save" kills a directory, 2013-06-28) soon.  For
> today's -rc3, I'm already deep into the integration cycle, so it is
> too late to do the revert it and then redo everything.
>
> Then we will plan to re-apply the patch once "ls-files --killed"
> gets fixed not to waste too much cycles needlessly, after the coming
> release.

I've already reverted the problematic patch to "git stash" and it
will not be part of the upcoming release.

Here is a quick attempt to see if we can do better in "ls-files -k".

We have an existing test t3010.3 that tries all the combinations of
directory turning into a regular file, symlink, etc. and vice versa,
and it seems to pass.  The test has a directory path6 in the working
tree without any paths in it in the index, and the added bypass code
seems to correctly trigger and prevents us from digging into that
directory, so this patch may be sufficient to improve "ls-files -k".

By the way, regarding the reverted commit, I do not think it is
enough to ask "ls-files -k" to see if the state recorded in the
current index is sufficient.  Imagine your HEAD records "path" as a
file and then you did this:

    $ git reset --hard ;# "path" is now a regular file
    $ mv path path.bak
    $ mkdir path
    $ mv path.bak path/file
    $ git add -A ;# "path/file" in the index and in the working tree
    $ >path/cruft ;# "path/cruft" in the working tree

Then call "save_stash" without saving untracked.  The resulting
stash will save the contents of "path/file" but "path/cruft" is not
recorded anywhere, and then we would need to bring the state in the
working tree and the index back to the state recorded in HEAD, hence
"path" needs to be turned back to a directory.

But "ls-files -k" is asked to check with the index, which has the
path as a directory, so this case is missed.

So instead of

	test -n "$(git ls-files --killed | head -n 1)"

in Pasky's patch, which probably is a right thing to do if you are
running "git stash save --keep-index", you would need something like
this if you are not running with "--keep-index":

	test -n "$(
        	GIT_INDEX_FILE=tmp_index
                export GIT_INDEX_FILE
                git read-tree HEAD
                git ls-files -k
	)"

in order to make sure that the result of going back to the state in
the HEAD will not clobber leftover "path/cruft".

 builtin/ls-files.c | 2 ++
 dir.c              | 9 +++++++++
 dir.h              | 3 ++-
 3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index 5cf3e31..8500446 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -219,6 +219,8 @@ static void show_files(struct dir_struct *dir)
 
 	/* For cached/deleted files we don't need to even do the readdir */
 	if (show_others || show_killed) {
+		if (!show_others)
+			dir->flags |= DIR_COLLECT_KILLED_ONLY;
 		fill_directory(dir, pathspec);
 		if (show_others)
 			show_other_files(dir);
diff --git a/dir.c b/dir.c
index 910bfcd..02939e2 100644
--- a/dir.c
+++ b/dir.c
@@ -1183,6 +1183,15 @@ static enum path_treatment treat_one_path(struct dir_struct *dir,
 	    cache_name_exists(path->buf, path->len, ignore_case))
 		return path_none;
 
+	/*
+	 * A directory can only contain killed files if the index
+	 * has a path that wants it to be a non-directory.
+	 */
+	if ((dir->flags & DIR_COLLECT_KILLED_ONLY) &&
+	    (dtype == DT_DIR) &&
+	    !cache_name_exists(path->buf, path->len, ignore_case))
+		return path_none;
+
 	exclude = is_excluded(dir, path->buf, &dtype);
 
 	/*
diff --git a/dir.h b/dir.h
index 3d6b80c..4677b86 100644
--- a/dir.h
+++ b/dir.h
@@ -80,7 +80,8 @@ struct dir_struct {
 		DIR_HIDE_EMPTY_DIRECTORIES = 1<<2,
 		DIR_NO_GITLINKS = 1<<3,
 		DIR_COLLECT_IGNORED = 1<<4,
-		DIR_SHOW_IGNORED_TOO = 1<<5
+		DIR_SHOW_IGNORED_TOO = 1<<5,
+		DIR_COLLECT_KILLED_ONLY = 1<<6
 	} flags;
 	struct dir_entry **entries;
 	struct dir_entry **ignored;
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]