Re: [PATCH] refs.c: get_ref_cache: use a bucket hash

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Jeff King <peff@xxxxxxxx> writes:

> The get_ref_cache code was designed to scale to the actual number of
> submodules. I do not mind seeing it become a hash if people really do
> have a large number of submodules, but that is not what is happening
> here.
> ...
> So git-clean speculatively asks "what is HEAD in this maybe-submodule?". The
> right solution is probably one of:
>
>   1. In remove_dirs, find out if we have an actual submodule before calling
>      resolve_gitlink_ref.
>
>   2. Teach get_ref_cache a "read-only" mode that will not auto-vivify the cache
>      if it does not already exist.
>
> Of the two, I think (1) is probably cleaner (I think the way the ref
> code is structured, we have to create the submodule ref_cache in order
> to start looking things up in it).

Thanks for a great analysis.  I too wondered if we should be growing
the per-submodule ref-cache when we are only probing.

> It looks like we don't even really care about the value of HEAD. We just
> want to know "is it a git directory?". I think in other places (like
> "git add"), we just do an existence check for "$dir/.git". That would
> not catch a bare repository, but I do not think the current check does
> either (it is looking for submodules, which always have a .git).

If we wanted to be consistent, perhaps we should be reusing the "is
this a git repository?" check used by the auto-discovery codepath
(setup.c:is_git_directory(), perhaps?), but the idea looks simple
enough and sounds sensible.

> Maybe something like (largely untested):
>
> diff --git a/builtin/clean.c b/builtin/clean.c
> index 98c103f..e2cc47b 100644
> --- a/builtin/clean.c
> +++ b/builtin/clean.c
> @@ -148,6 +148,32 @@ static int exclude_cb(const struct option *opt, const char *arg, int unset)
>  	return 0;
>  }
>  
> +static int dir_is_repo(struct strbuf *path)
> +{
> +	size_t orig = path->len;
> +	int ret;
> +
> +	strbuf_addstr(path, "/.git");
> +	if (!access(path->buf, F_OK))
> +		ret = 1; /* definitely */
> +	else if (errno == ENOENT)
> +		ret = 0; /* definitely not */
> +	else {
> +		/*
> +		 * We couldn't tell. It would probably be safer to err
> +		 * on the side of saying "yes" here, because we are
> +		 * deciding what to delete, and are more likely to keep
> +		 * a sub-repo. But it would probably also create annoying
> +		 * false positives, where a directory we do not have
> +		 * permission to read would say something misleading
> +		 * like "not deleting sub-repo foo..."
> +		 */
> +		ret = 0;
> +	}
> +	strbuf_setlen(path, orig);
> +	return ret;
> +}
> +
>  static int remove_dirs(struct strbuf *path, const char *prefix, int force_flag,
>  		int dry_run, int quiet, int *dir_gone)
>  {
> @@ -155,13 +181,11 @@ static int remove_dirs(struct strbuf *path, const char *prefix, int force_flag,
>  	struct strbuf quoted = STRBUF_INIT;
>  	struct dirent *e;
>  	int res = 0, ret = 0, gone = 1, original_len = path->len, len;
> -	unsigned char submodule_head[20];
>  	struct string_list dels = STRING_LIST_INIT_DUP;
>  
>  	*dir_gone = 1;
>  
> -	if ((force_flag & REMOVE_DIR_KEEP_NESTED_GIT) &&
> -			!resolve_gitlink_ref(path->buf, "HEAD", submodule_head)) {
> +	if ((force_flag & REMOVE_DIR_KEEP_NESTED_GIT) && dir_is_repo(path)) {
>  		if (!quiet) {
>  			quote_path_relative(path->buf, prefix, &quoted);
>  			printf(dry_run ?  _(msg_would_skip_git_dir) : _(msg_skip_git_dir),
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]