Re: [RFC v2 PATCH] Teach rm to remove submodules unless they contain a git directory

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Jens Lehmann <Jens.Lehmann@xxxxxx> writes:

> diff --git a/Documentation/git-rm.txt b/Documentation/git-rm.txt
> index 5d31860..3c76f9c 100644
> --- a/Documentation/git-rm.txt
> +++ b/Documentation/git-rm.txt
> @@ -107,6 +107,21 @@ as well as modifications of existing paths.
>  Typically you would first remove all tracked files from the working
>  tree using this command:
>
> +Submodules
> +~~~~~~~~~~~~~~~~~~~~

You need to match the underline to the text if you want to make this
a heading.

> diff --git a/builtin/rm.c b/builtin/rm.c
> index 90c8a50..cb927a8 100644
> --- a/builtin/rm.c
> +++ b/builtin/rm.c
> @@ -9,6 +9,7 @@
>  #include "cache-tree.h"
>  #include "tree-walk.h"
>  #include "parse-options.h"
> +#include "submodule.h"
>
>  static const char * const builtin_rm_usage[] = {
>  	"git rm [options] [--] <file>...",
> @@ -17,9 +18,43 @@ static const char * const builtin_rm_usage[] = {
>
>  static struct {
>  	int nr, alloc;
> -	const char **name;
> +	struct {
> +		const char *name;
> +		char is_submodule;
> +	} *entry;
>  } list;
>
> +static int check_submodules_use_gitfiles()

static int check_submodules_use_gitfiles(void)

> +{
> +	int i;
> +	int errs = 0;
> +
> +	for (i = 0; i < list.nr; i++) {
> +		const char *name = list.entry[i].name;
> +		int pos;
> +		struct cache_entry *ce;
> +		struct stat st;
> +
> +		pos = cache_name_pos(name, strlen(name));
> +		if (pos < 0)
> +			continue; /* ignore unmerged entry */

Would this cause "git rm -f path" for an unmerged submodule bypass
the safety check?

With or without this patch, check_local_mod() will allow you to
remove unmerged entry and the file in the working tree, and that is
perfectly fine for a regular file or a symlink (as the path is
involved in a conflicted merge (or other mergy operation), and its
change from the HEAD can only come from that merge, because we would
not let merge touch a path and leave its index entry unmerged if the
path has local changes in the first place).  Resolving the merge as
a removal at the index level for a submodule is also fine in such a
case, but don't you want to still keep the submodule working tree if
it has its sole copy of the repository?  And as far as I can tell,
this function is the only thing that protects the user in such a
situation.

> +		ce = active_cache[pos];
> +
> +		if (!S_ISGITLINK(ce->ce_mode) ||
> +		    (lstat(ce->name, &st) < 0) ||
> +		    is_empty_dir(name))
> +			continue;
> +
> +		if (!submodule_uses_gitfile(name))
> +			errs = error(_("submodule '%s' (or one of its nested "
> +				     "submodules) uses a .git directory\n"
> +				     "(use 'rm -rf' if you really want to remove "
> +				     "it including all of its history)"), name);
> +	}
> +
> +	return errs;
> +}

The call to this function comes at the very end and gives us yes/no
for the entire set of paths.  After getting this error for one
submodule and bunch of other non-submodule paths, what is the
procedure for the user to remove it that we want to recommend in our
documentation?  Would it go like this?

	$ git rm path1 path2 sub path3
	... get the above error ...
	$ git submodule --to-gitfile sub
        $ rm -fr sub
        $ git rm sub
        ... then finally ...
        $ git rm path1 path2 path3

This is not a complaint about the error message above, by the way.

> @@ -37,7 +72,7 @@ static int check_local_mod(unsigned char *head, int index_only)
>  		struct stat st;
>  		int pos;
>  		struct cache_entry *ce;
> -		const char *name = list.name[i];
> +		const char *name = list.entry[i].name;
>  		unsigned char sha1[20];
>  		unsigned mode;
>  		int local_changes = 0;
> @@ -58,9 +93,10 @@ static int check_local_mod(unsigned char *head, int index_only)
>  			/* if a file was removed and it is now a
>  			 * directory, that is the same as ENOENT as
>  			 * far as git is concerned; we do not track
> -			 * directories.
> +			 * directories unless they are submodules.
>  			 */
> -			continue;
> +			if (!S_ISGITLINK(ce->ce_mode))
> +				continue;
>  		}
>
>  		/*
> @@ -80,8 +116,11 @@ static int check_local_mod(unsigned char *head, int index_only)
>
>  		/*
>  		 * Is the index different from the file in the work tree?
> +		 * If it's a submodule, is its work tree modified?
>  		 */
> -		if (ce_match_stat(ce, &st, 0))
> +		if (ce_match_stat(ce, &st, 0) ||
> +		    (S_ISGITLINK(ce->ce_mode) &&
> +		     !ok_to_remove_submodule(ce->name)))
>  			local_changes = 1;

As noted before, because we also skip these "does it match the
index?  does it match the HEAD?" checks for unmerged paths in this
function, a submodule that has local changes or new files is
eligible for removal during a conflicted merge.  I have a feeling
that this should be tightened a bit; wouldn't we want to check at
least in the checked out version (i.e. stage #2 in the index) if the
path were a submodule, even if we are in the middle of a conflicted
merge?  After all, the top level merge shouldn't have touched the
submodule working tree, so the local modes and new files must have
come from the end user action that was done _before_ the conflicted
merge started, and not expendable, no?



-- >8 --

 Documentation/git-rm.txt | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git i/Documentation/git-rm.txt w/Documentation/git-rm.txt
index 3c76f9c..882cb11 100644
--- i/Documentation/git-rm.txt
+++ w/Documentation/git-rm.txt
@@ -108,13 +108,13 @@ Typically you would first remove all tracked files from the working
 tree using this command:
 
 Submodules
-~~~~~~~~~~~~~~~~~~~~
+~~~~~~~~~~
 Only submodules using a gitfile (which means they were cloned
 with a git version 1.7.8 or newer) will be removed from the work
 tree, as their repository lives inside the .git directory of the
 superproject. If a submodule (or one of those nested inside it)
-still use a .git directory, `git rm` will fail - no matter if forced
-or not - to protect the submodules history.
+still uses a .git directory, `git rm` will fail - no matter if forced
+or not - to protect the submodule's history.
 
 A submodule is considered up-to-date when the HEAD is the same as
 recorded in the index, no tracked files are modified and no untracked
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]