Re: [PATCH v6 3/5] config: learn `git_protected_config()`

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Taylor Blau <me@xxxxxxxxxxxx> writes:

> On Thu, Jun 30, 2022 at 06:13:57PM +0000, Glen Choo via GitGitGadget wrote:
>> In light of constraint 1, this implementation can still be improved
>> since git_protected_config() iterates through every variable in
>> protected_config, which may still be too expensive. There exist constant
>> time lookup functions for non-protected configuration
>> (repo_config_get_*()), but for simplicity, this commit does not
>> implement similar functions for protected configuration.
>
> I don't quite follow along with this paragraph: it sounds like reading
> protected configuration is supposed to be as fast as possible. But you
> note that only the slower variant of reading each configuration variable
> one at a time is implemented.

Right. I should have been clearer that this implementation is "fast
enough without introducing too much noise/complexity", and not "as fast
as possible".

> If we care about speed (and I think we should here), then would it make
> more sense to implement only the lookup functions like
> repo_config_get_*() for protected context? That would encourage usage by
> providing a more limited set of options to callers.

I held off on implementing these functions because:

- It requires rewriting `safe.directory`, which reads a multivalued
  string using a config iterator. It's not onerous to do (I had a POC
  of this at some point), but it seemed pretty noisy.
- It seems too noisy to implement all of the protected_config_get_*()
  functions, and a little inconsistent to only implement the ones used
  in this series (but maybe a little inconsistency is ok?)

But maybe a little noise and inconsistency is worth the performance
improvement, especially since it's been brought up ~1.5 times before
this [1] [2]. I'll do this for sure if you feel strongly about it,
otherwise I'll just try it out just to see what I think about it.

[1] https://lore.kernel.org/git/802c3541-3301-43fc-c39e-edd44e61a4eb@xxxxxxxxxx
[2] https://lore.kernel.org/git/xmqqbkv4t7gp.fsf@gitster.g

>> Signed-off-by: Glen Choo <chooglen@xxxxxxxxxx>
>> ---
>>  config.c                     | 51 ++++++++++++++++++++++++++++++++++++
>>  config.h                     | 17 ++++++++++++
>>  t/t5544-pack-objects-hook.sh |  7 ++++-
>>  upload-pack.c                | 27 ++++++++++++-------
>>  4 files changed, 91 insertions(+), 11 deletions(-)
>>
>> diff --git a/config.c b/config.c
>> index 9b0e9c93285..29e62f5d0ed 100644
>> --- a/config.c
>> +++ b/config.c
>> @@ -81,6 +81,18 @@ static enum config_scope current_parsing_scope;
>>  static int pack_compression_seen;
>>  static int zlib_compression_seen;
>>
>> +/*
>> + * Config that comes from trusted sources, namely:
>
> Should we be using the word "scope" here instead of sources? I think
> it's clear enough from the context what you're referring to, but in the
> spirit of being consistent...

Good catch.

>> + * - system config files (e.g. /etc/gitconfig)
>> + * - global config files (e.g. $HOME/.gitconfig,
>> + *   $XDG_CONFIG_HOME/git)
>> + * - the command line.
>> + *
>> + * This is declared here for code cleanliness, but unlike the other
>> + * static variables, this does not hold config parser state.
>> + */
>> +static struct config_set protected_config;
>> +
>>  static int config_file_fgetc(struct config_source *conf)
>>  {
>>  	return getc_unlocked(conf->u.file);
>> @@ -2378,6 +2390,11 @@ int git_configset_add_file(struct config_set *cs, const char *filename)
>>  	return git_config_from_file(config_set_callback, filename, cs);
>>  }
>>
>> +int git_configset_add_parameters(struct config_set *cs)
>> +{
>> +	return git_config_from_parameters(config_set_callback, cs);
>> +}
>> +
>>  int git_configset_get_value(struct config_set *cs, const char *key, const char **value)
>>  {
>>  	const struct string_list *values = NULL;
>> @@ -2619,6 +2636,40 @@ int repo_config_get_pathname(struct repository *repo,
>>  	return ret;
>>  }
>>
>> +/* Read values into protected_config. */
>> +static void read_protected_config(void)
>> +{
>> +	char *xdg_config = NULL, *user_config = NULL, *system_config = NULL;
>> +
>> +	git_configset_init(&protected_config);
>> +
>> +	system_config = git_system_config();
>> +	git_global_config(&user_config, &xdg_config);
>> +
>> +	git_configset_add_file(&protected_config, system_config);
>> +	git_configset_add_file(&protected_config, xdg_config);
>> +	git_configset_add_file(&protected_config, user_config);
>> +	git_configset_add_parameters(&protected_config);
>> +
>> +	free(system_config);
>> +	free(xdg_config);
>> +	free(user_config);
>> +}
>> +
>> +/* Ensure that protected_config has been initialized. */
>> +static void git_protected_config_check_init(void)
>> +{
>> +	if (protected_config.hash_initialized)
>> +		return;
>> +	read_protected_config();
>> +}
>> +
>> +void git_protected_config(config_fn_t fn, void *data)
>> +{
>> +	git_protected_config_check_init();
>
> This may be copying from an existing pattern, but I think you could
> avoid the extra function declaration by writing git_protected_config()
> as:
>
>     if (!protected_config.hash_initialized)
>         read_protected_config();
>     configset_iter(&protected_config, fn, data);

You're right, I can drop this if I don't implement
protected_config_get_*(); this pattern only makes sense for
git_config_check_init() because it's called by multiple functions.

> Thanks,
> Taylor



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux