Re: [PATCH 2/3] setup: add discover_git_directory_reason()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



"Derrick Stolee via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:

> From: Derrick Stolee <derrickstolee@xxxxxxxxxx>
>
> There are many reasons why discovering a Git directory may fail. In
> particular, 8959555cee7 (setup_git_directory(): add an owner check for
> the top-level directory, 2022-03-02) added ownership checks as a
> security precaution.
>
> Callers attempting to set up a Git directory may want to inform the user
> about the reason for the failure. For that, expose the enum
> discovery_result from within setup.c and into cache.h where
> discover_git_directory() is defined.
>
> I initially wanted to change the return type of discover_git_directory()
> to be this enum, but several callers rely upon the "zero means success".
> The two problems with this are:
>
> 1. The zero value of the enum is actually GIT_DIR_NONE, so nonpositive
>    results are errors.

True. discover_git_directory() already knows that negative return
values from setup_git_directory_gently_1() signal errors while 0 or
positive are OK.

> 2. There are multiple successful states, so some positive results are
>    successful.

Makes it sound as if some positive results are not successes, but is
that really the case?

> Instead of updating all callers immediately, add a new method,
> discover_git_directory_reason(), and convert discover_git_directory() to
> be a thin shim on top of it.

It makes sense to insulate callers who only want to know if the
discovery was successful or not (there only are two existing callers
anyway) from the details.  And turning a thin wrapper to the new API
that gives richer return codes is the way to go.  Nicely designed.

> Because there are extra checks that discover_git_directory_reason() does
> after setup_git_directory_gently_1(), there are other modes that can be
> returned for failure states. Add these modes to the enum, but be sure to
> explicitly add them as BUG() states in the switch of
> setup_git_directory_gently().

Good.

> -enum discovery_result {
> -	GIT_DIR_NONE = 0,
> -	GIT_DIR_EXPLICIT,
> -	GIT_DIR_DISCOVERED,
> -	GIT_DIR_BARE,
> -	/* these are errors */
> -	GIT_DIR_HIT_CEILING = -1,
> -	GIT_DIR_HIT_MOUNT_POINT = -2,
> -	GIT_DIR_INVALID_GITFILE = -3,
> -	GIT_DIR_INVALID_OWNERSHIP = -4,
> -	GIT_DIR_DISALLOWED_BARE = -5,
> -};

So we promote this discovery_result, that was private implementation
detail inside the setup code, to a public interface.  Is GIT_DIR_
prefix still appropriate, or would it make more sense to have a
common substring derived from the word DISCOVERY in them?

> @@ -1385,21 +1372,22 @@ static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir,
>  	}
>  }
>  
> -int discover_git_directory(struct strbuf *commondir,
> -			   struct strbuf *gitdir)
> +enum discovery_result discover_git_directory_reason(struct strbuf *commondir,
> +						    struct strbuf *gitdir)
>  {
>  	struct strbuf dir = STRBUF_INIT, err = STRBUF_INIT;
>  	size_t gitdir_offset = gitdir->len, cwd_len;
>  	size_t commondir_offset = commondir->len;
>  	struct repository_format candidate = REPOSITORY_FORMAT_INIT;
> +	enum discovery_result result;
>  
>  	if (strbuf_getcwd(&dir))
> -		return -1;
> +		return GIT_DIR_CWD_FAILURE;

Makes sense.

>  	cwd_len = dir.len;
> -	if (setup_git_directory_gently_1(&dir, gitdir, NULL, 0) <= 0) {
> +	if ((result = setup_git_directory_gently_1(&dir, gitdir, NULL, 0)) <= 0) {

Can we split this into two simple statements?

	result = setup_git_directory_gently_1(...);
	if (result <= 0) {

>  		strbuf_release(&dir);
> -		return -1;
> +		return result;
>  	}

OK.

> @@ -1429,11 +1417,11 @@ int discover_git_directory(struct strbuf *commondir,
>  		strbuf_setlen(commondir, commondir_offset);
>  		strbuf_setlen(gitdir, gitdir_offset);
>  		clear_repository_format(&candidate);
> -		return -1;
> +		return GIT_DIR_INVALID_FORMAT;

OK, so this thing is new.  Earlier we thought we found a good
GIT_DIR but it turns out the repository is something we cannot use.
Over time we may acquire such a "now we found it, is it really good?"
sanity checks, but for now, this is the only case that turns what
gently_1() thought was good into a bad one.  OK.

>  	}
>  
>  	clear_repository_format(&candidate);
> -	return 0;
> +	return result;

And it makes perfect sense that everybody who passed such a "post
discovery check" are OK and we return the result from gently_1().

> @@ -1516,9 +1504,11 @@ const char *setup_git_directory_gently(int *nongit_ok)
>  		*nongit_ok = 1;
>  		break;
>  	case GIT_DIR_NONE:
> +	case GIT_DIR_CWD_FAILURE:
> +	case GIT_DIR_INVALID_FORMAT:
>  		/*
>  		 * As a safeguard against setup_git_directory_gently_1 returning
> -		 * this value, fallthrough to BUG. Otherwise it is possible to
> +		 * these values, fallthrough to BUG. Otherwise it is possible to
>  		 * set startup_info->have_repository to 1 when we did nothing to
>  		 * find a repository.
>  		 */

OK.

Not a new problem, but does anybody explicitly or implicitly return
DIR_NONE?  I didn't find any codepath that does so.  Presumably it
may have been arranged in the hope that a code structured like so:

	enum discovery_result res = GIT_DIR_NONE;

	if (some complex condition)
		res = ...;
	else if (another complex condition)
		res = ...;

	... sometime later ...
	if (res <= 0)
		we found a bad one

would ensure that "res" untouched by setup_git_directory_gently_1()
is still an error, but I am not sure if it is effective, given that
nobody uses GIT_DIR_NONE to assign or initialize anything.  And the
same effect can be had by leaving 'res' uninitialized---the compilers
are our friend.

Not a part of this review, but I wonder if it makes sense for us to
get rid of DIR_NONE.

> -int discover_git_directory(struct strbuf *commondir,
> -			   struct strbuf *gitdir);
> +static inline int discover_git_directory(struct strbuf *commondir,
> +					 struct strbuf *gitdir)
> +{
> +	return discover_git_directory_reason(commondir, gitdir) <= 0;
> +}

The _reason() thing is more or less like setup_git_directory_gently_1()
in that positives are success and everything else is an error.  And
the point of keeping this thin wrapper as discover_git_directory() is
to insulate the existing callers that discover_git_directory() returns
-1 for failure, while returning 0 for success.

So this wrapper smells very wrong.  It now flips the polarity of the
error return into a positive 1, no?  That does not achieve the goal
to insulate the callers from the change in implementation.

Other than that, as you wrote in the cover letter, I think it is an
excellent move to have an interface to expose details of what
discovery has found.

Thanks.



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux