Re: [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Peff,

On Tue, 6 Dec 2016, Jeff King wrote:

> On Tue, Dec 06, 2016 at 02:16:35PM +0100, Johannes Schindelin wrote:
> 
> > > Johannes Schindelin <Johannes.Schindelin@xxxxxx> writes:
> > > 
> > > > Seriously, do you really think it is a good idea to have
> > > > git_config_get_value() *ignore* any value in .git/config?
> > > 
> > > When we do not know ".git/" directory we see is the repository we
> > > were told to work on, then we should ignore ".git/config" file.  So
> > > allowing git_config_get_value() to _ignore_ ".git/config" before the
> > > program calls setup_git_directory() does have its uses.
> > 
> > I think you are wrong. This is yet another inconsistent behavior that
> > violates the Law of Least Surprises.
> 
> There are surprises in this code any way you turn.  If we have not
> called setup_git_directory(), then how does git_config_get_value() know
> if we are in a repository or not?

My biggest surprise, frankly, would be that `git init` and `git clone` are
not special-cased.

> Should it blindly look at ".git/config"?

Absolutely not, of course. You did not need me to say that.

> Now your program behaves differently depending on whether you are in the
> top-level of the working tree.

Exactly. This, BTW, is already how the code would behave if anybody called
`git_path()` before `setup_git_directory()`, as the former function
implicitly calls `setup_git_env()` which does *not* call
`setup_git_directory()` but *does* set up `git_dir` which is then used by
`do_git_config_sequence()`..

We have a few of these nasty surprises in our code base, where code
silently assumes that global state is set up correctly, and succeeds in
sometimes surprising ways if it is not.

> Should it speculatively do repo discovery, and use any discovered config
> file?

Personally, I find the way we discover the repository most irritating. It
seems that we have multiple, mutually incompatible code paths
(`setup_git_directory()` and `setup_git_env()` come to mind already, and
it does not help that consecutive calls to `setup_git_directory()` will
yield a very unexpected outcome).

Just try to explain to a veteran software engineer why you cannot call
`setup_git_directory_gently()` multiple times and expect the very same
return value every time.

Another irritation is that some commands that clearly would like to use a
repository if there is one (such as `git diff`) are *not* marked with
RUN_SETUP_GENTLY, due to these unfortunate implementation details.

> Now some commands respect config that they shouldn't (e.g., running "git
> init foo.git" from inside another repository will accidentally pick up
> the value of core.sharedrepository from wherever you happen to run it).

Right. That points to another problem with the way we do things: we leak
global state from discovering a git_dir, which means that we can neither
undo nor override it.

If we discovered our git_dir in a robust manner, `git init` could say:
hey, this git_dir is actually not what I wanted, here is what I want.

Likewise, `git submodule` would eventually be able to run in the very same
process as the calling `git`, as would a local fetch.

> So I think the caller of the config code has to provide some kind of
> context about how it is expecting to run and how the value will be used.

Yep.

Maybe even go a step further and say that the config code needs a context
"object".

> Right now if setup_git_directory() or similar hasn't been called, the
> config code does not look.

Correct.

Actually, half correct. If setup_git_directory() has not been called, but,
say, git_path() (and thereby implicitly setup_git_env()), the config code
*does* look. At a hard-coded .git/config.

> Ideally there would be a way for a caller to say "I am running early and
> not even sure yet if we are in a repo; please speculatively try to find
> repo config for me".

And ideally, it would not roll *yet another* way to discover the git_dir,
but it would reuse the same function (fixing it not to chdir()
unilaterally).

Of course, not using `chdir()` means that we have to figure out symbolic
links somehow, in case somebody works from a symlinked subdirectory, e.g.:

	ln -s $PWD/t/ ~/test-directory
	cd ~/test-directory
	git log

> The pager code does this manually, and without great accuracy; see the
> hack in pager.c's read_early_config().

I saw it. And that is what triggered the mail you are responding to (you
probably saw my eye-rolling between the lines).

The real question is: can we fix this? Or is there simply too great
reluctance to change the current code?

> I think the way forward is:
> 
>   1. Make that an optional behavior in git_config_with_options() so
>      other spots can reuse it (probably alias lookup, and something like
>      your difftool config).

Ideally: *any* early call to `git_config_get_value()`. Make things less
surprising.

>   2. Make it more accurate. Right now it blindly looks in .git/config,
>      but it should be able to do the usual repo-detection (_without_
>      actually entering the repo) to try to find a possible config file.

The real trick will be to convince Junio to have a single function for
git_dir discovery, I guess, lest we have multiple, slightly incompatible
ways to discover it. I expect a lot of resistance here, because we would
have to change tried-and-tested (if POLA-violating) code.

Ciao,
Dscho



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]