Re: [WIP v2 2/2] config: include file if remote URL matches a glob

On Fri, Oct 29, 2021 at 10:31:10AM -0700, Jonathan Tan wrote:
> 
> This is a feature that supports config file inclusion conditional on
> whether the repo has a remote with a URL that matches a glob.
> 
> Similar to my previous work on remote-suggested hooks [1], the main
> motivation is to allow remote repo administrators to provide recommended
> configs in a way that can be consumed more easily (e.g. through a
> package installable by a package manager).

To expand a little more on this:

At Google we ship /etc/gitconfig, as well as /usr/share/git-core/. Our
/etc/gitconfig looks basically like:

 [include]
   path = /usr/share/git-core/gitconfig
   path = /usr/share/git-core/some-specific-config
   path = /usr/share/git-core/other-specific-config

Jonathan's WIP allows us to append lines to /etc/gitconfig sort of like

 [includeIf "hasRemoteUrl:https://internal-google/big-project";]
   path = /usr/share/big-project/gitconfig

That's approach #1 to shipping a config, which we might use for a
project that makes up a significant portion of our userbase. We ship
(and own) the /etc/gitconfig; BigProject team ships and owns their own
gitconfig; everybody internally who works on BigProject, whether it's
just once to fix a small thing or every day as their main job, gets the
relevant configs for BigProject.

Approach #2 I think is also still a useful one, and maybe more
interesting outside of Google:

When I run 'sudo apt install big-oss-project-devkit', a few things
happen:
1. /usr/share/big-oss-project/gitconfig appears
2. `git config --global \
		'includeIf.hasRemoteUrl:https://github/big-oss-project/*.path' \
		'/usr/share/big-oss-project/gitconfig'` is run
3. whatever other special tools, scripts, etc. are installed

That way regardless of which project I'm working on -
big-oss-project/translation, big-oss-project/docs,
big-oss-project/big-oss-project - I still get configs and style checkers
and whatever else.
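
Concretely, step 2 would drop something like this into the user's
~/.gitconfig (a sketch using the hasRemoteUrl syntax from this WIP; the
exact spelling is whatever the patch ends up with):

 [includeIf "hasRemoteUrl:https://github/big-oss-project/*"]
   path = /usr/share/big-oss-project/gitconfig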

With this approach #2, it's still possible for someone to do a drive-by
contribution without ever running 'apt install big-oss-project-devkit',
so it's not quite as strong a recommendation as the former
"remote-suggested-hooks" topic. User would still want to take a look at
the README for big-oss-project to learn they're supposed to be
installing that package ahead of time. But it's still a oneshot setup
for nice things like partial clone filters, maybe sparsity filters,
maybe config-based hooks, etc., especially if big-oss-project already
was shipping some project-specific tooling (like maybe a special
debugger or a docker image or I don't know).
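
To make those "nice things" concrete, /usr/share/big-oss-project/gitconfig
could carry something along these lines (purely illustrative; which keys
the project actually ships is up to them, and I'm only using existing
keys like core.hooksPath as stand-ins here):

 # hypothetical contents of /usr/share/big-oss-project/gitconfig
 [remote "origin"]
   # only meaningful if the user's remote is actually named "origin"
   partialCloneFilter = blob:none
 [blame]
   ignoreRevsFile = .git-blame-ignore-revs
 [core]
   hooksPath = /usr/share/big-oss-project/hooks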

The nice thing about 'hasRemoteUrl' in this case is that we don't need
to know the location of the user's big-oss-project/ checkout on disk. We
can set that config globally and they can checkout big-oss-project as
many times and as many places as they wish. It wouldn't be possible to
ship configs via a package manager or other automated script without it.
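
For comparison, doing the same with the existing "gitdir" conditional
include would mean enumerating every place a user might put a checkout
(paths below are made up):

 [includeIf "gitdir:~/src/big-oss-project/"]
   path = /usr/share/big-oss-project/gitconfig
 [includeIf "gitdir:~/work/boss-translation/"]
   path = /usr/share/big-oss-project/gitconfig

...and so on for every clone, which is exactly what a package's
postinstall script can't know ahead of time.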

> 
> NEEDSWORK: The way this works is that if we see such an include, we
> shunt all subsequent configs into a stash (while looking for URLs), then
> process the stash. In particular, this means that more memory is needed,
> and the nature of error reporting changes (currently, if a callback
> returns nonzero for a variable, processing halts immediately, but with
> this patch, all the config might be read from disk before the callback
> even sees the variable). I'll need to expand on this and write a
> documentation section.

Hm. I'm not so sure about making another structure for storing config
in memory, because we already do that during the regular config parse
(to make things like git_config_get_string() fast). Can we not re-walk
the in-memory config at the end of the normal parse, rather than reading
from disk twice?

I think the git_config()/repo_config() callback machinery even does that
for you for free...?

void repo_config(struct repository *repo, config_fn_t fn, void *data)
{
	git_config_check_init(repo);
	configset_iter(repo->config, fn, data);
}
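
For example, something along these lines (a rough sketch, not code from
the patch; the callback and the URL collection here are hypothetical)
would replay the already-parsed configset without touching the disk
again:

static int collect_remote_urls(const char *var, const char *value, void *data)
{
	struct string_list *urls = data;

	/* pick remote.<name>.url entries out of the cached configset */
	if (value && starts_with(var, "remote.") && ends_with(var, ".url"))
		string_list_append(urls, value);
	return 0;
}

/* then, from the caller, after the normal config parse: */
struct string_list urls = STRING_LIST_INIT_DUP;
repo_config(the_repository, collect_remote_urls, &urls);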

> 
> One alternative is to rerun the config parsing mechanism upon noticing
> the first URL-conditional include in order to find all URLs. This would
> require the config files to be read from disk twice, though.

What's the easiest way to "try it and see", to add tooling and find out
whether the config files would be reopened during the second parse?
Because I suspect that we won't actually reopen those files, due to the
config cache.
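
One crude way to check (assuming a Linux box with strace handy, a repo
whose remote URL actually triggers the include, and the WIP build of git
in PATH) would be to just count the opens:

  strace -f -e trace=open,openat git status 2>&1 | grep -c gitconfig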

So couldn't we do something like....

pass #1:
 if (include)
   if (not hasRemoteUrl)
     open up path & parse
 put config into in-memory cache normally
pass #2: (and this pass would need to be added to repo_config() probably)
 if (include)
   if (hasRemoteUrl)
     open up path & parse
     insert in-order into in-memory cache
 don't touch existing configs otherwise

I think it's in practice similar to the approach you're using (getting
around the weird ordering with a cache in memory), but we could reuse
the existing config cache rather than creating a new and different one.

 - Emily


