Hi Peff, On Tue, 6 Dec 2016, Jeff King wrote: > On Tue, Dec 06, 2016 at 02:16:35PM +0100, Johannes Schindelin wrote: > > > > Johannes Schindelin <Johannes.Schindelin@xxxxxx> writes: > > > > > > > Seriously, do you really think it is a good idea to have > > > > git_config_get_value() *ignore* any value in .git/config? > > > > > > When we do not know ".git/" directory we see is the repository we > > > were told to work on, then we should ignore ".git/config" file. So > > > allowing git_config_get_value() to _ignore_ ".git/config" before the > > > program calls setup_git_directory() does have its uses. > > > > I think you are wrong. This is yet another inconsistent behavior that > > violates the Law of Least Surprises. > > There are surprises in this code any way you turn. If we have not > called setup_git_directory(), then how does git_config_get_value() know > if we are in a repository or not? My biggest surprise, frankly, would be that `git init` and `git clone` are not special-cased. > Should it blindly look at ".git/config"? Absolutely not, of course. You did not need me to say that. > Now your program behaves differently depending on whether you are in the > top-level of the working tree. Exactly. This, BTW, is already how the code would behave if anybody called `git_path()` before `setup_git_directory()`, as the former function implicitly calls `setup_git_env()` which does *not* call `setup_git_directory()` but *does* set up `git_dir` which is then used by `do_git_config_sequence()`.. We have a few of these nasty surprises in our code base, where code silently assumes that global state is set up correctly, and succeeds in sometimes surprising ways if it is not. > Should it speculatively do repo discovery, and use any discovered config > file? Personally, I find the way we discover the repository most irritating. It seems that we have multiple, mutually incompatible code paths (`setup_git_directory()` and `setup_git_env()` come to mind already, and it does not help that consecutive calls to `setup_git_directory()` will yield a very unexpected outcome). Just try to explain to a veteran software engineer why you cannot call `setup_git_directory_gently()` multiple times and expect the very same return value every time. Another irritation is that some commands that clearly would like to use a repository if there is one (such as `git diff`) are *not* marked with RUN_SETUP_GENTLY, due to these unfortunate implementation details. > Now some commands respect config that they shouldn't (e.g., running "git > init foo.git" from inside another repository will accidentally pick up > the value of core.sharedrepository from wherever you happen to run it). Right. That points to another problem with the way we do things: we leak global state from discovering a git_dir, which means that we can neither undo nor override it. If we discovered our git_dir in a robust manner, `git init` could say: hey, this git_dir is actually not what I wanted, here is what I want. Likewise, `git submodule` would eventually be able to run in the very same process as the calling `git`, as would a local fetch. > So I think the caller of the config code has to provide some kind of > context about how it is expecting to run and how the value will be used. Yep. Maybe even go a step further and say that the config code needs a context "object". > Right now if setup_git_directory() or similar hasn't been called, the > config code does not look. Correct. Actually, half correct. If setup_git_directory() has not been called, but, say, git_path() (and thereby implicitly setup_git_env()), the config code *does* look. At a hard-coded .git/config. > Ideally there would be a way for a caller to say "I am running early and > not even sure yet if we are in a repo; please speculatively try to find > repo config for me". And ideally, it would not roll *yet another* way to discover the git_dir, but it would reuse the same function (fixing it not to chdir() unilaterally). Of course, not using `chdir()` means that we have to figure out symbolic links somehow, in case somebody works from a symlinked subdirectory, e.g.: ln -s $PWD/t/ ~/test-directory cd ~/test-directory git log > The pager code does this manually, and without great accuracy; see the > hack in pager.c's read_early_config(). I saw it. And that is what triggered the mail you are responding to (you probably saw my eye-rolling between the lines). The real question is: can we fix this? Or is there simply too great reluctance to change the current code? > I think the way forward is: > > 1. Make that an optional behavior in git_config_with_options() so > other spots can reuse it (probably alias lookup, and something like > your difftool config). Ideally: *any* early call to `git_config_get_value()`. Make things less surprising. > 2. Make it more accurate. Right now it blindly looks in .git/config, > but it should be able to do the usual repo-detection (_without_ > actually entering the repo) to try to find a possible config file. The real trick will be to convince Junio to have a single function for git_dir discovery, I guess, lest we have multiple, slightly incompatible ways to discover it. I expect a lot of resistance here, because we would have to change tried-and-tested (if POLA-violating) code. Ciao, Dscho