Re: [RFC/PATCH v2 1/2] config: Add hashtable for config parsing & retrieval

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Jun 02, 2014 at 07:47:39AM -0700, Tanay Abhra wrote:

> +static int config_cache_entry_cmp_icase(const struct config_cache_entry *e1,
> +				 const struct config_cache_entry *e2, const char* key)
> +{
> +	return strcasecmp(e1->key, key ? key : e2->key);
> +}
> +
> +static void config_cache_init(void)
> +{
> +	hashmap_init(&config_cache, (hashmap_cmp_fn)config_cache_entry_cmp_icase, 0);
> +}

It looks like this is meant to treat config keys like "diff.renames" as
case-insensitive. That's OK for two-part names, but longer names with a
middle section, like "remote.Foo.url", should have their middle sections
be case-sensitive.

Also, please avoid casting function ptrs. Give your function the correct
hashmap_cmp_fn signature, and cast from "void *" internally if you need
to.

> +static void config_cache_free(void)
> +{
> +	struct config_cache_entry* entry;
> +	struct hashmap_iter iter;
> +	hashmap_iter_init(&config_cache, &iter);
> +	while (entry = hashmap_iter_next(&iter)) {

Please wrap assignment inside a loop condition with an extra (), like:

  while ((entry = hashmap_iter_next(&iter)))

This suppresses gcc's "did you mean ==" warning. If you are not
compiling with "-Wall -Werror" for development, you probably should be.

> +static struct config_cache_entry *config_cache_find_entry(const char *key)
> +{
> +	struct hashmap_entry k;
> +	hashmap_entry_init(&k, strihash(key));
> +	return hashmap_get(&config_cache, &k, key);
> +}

As above, strihash isn't quite right here. It needs a custom function
that takes into account the case-sensitivity of the middle section of
the key name.

An alternative is to only feed already-normalized names into the hash
code, and then tell the hash to be case-sensitive.

> +static void config_cache_set_value(const char *key, const char *value)
> +{
> +	struct config_cache_entry *e;
> +
> +	e = config_cache_find_entry(key);
> +	if (!e) {
> +		e = malloc(sizeof(*e));
> +		hashmap_entry_init(e, strihash(key));
> +		e->key=xstrdup(key);

Style: please keep whitespace around "=".

> +	} else {
> +		if (!unsorted_string_list_has_string(e->value_list, value))
> +			string_list_append(e->value_list, value);
> +	}

I don't think we want to suppress duplicates here. If a caller reads the
value with git_config_get_string(), then the duplicates do not matter
(we show only the one with precedence). If they use the "multi()" form,
then they _should_ see each version. It is not up to the config code to
decide that duplicate values for the same key are not interesting; only
the caller knows that.

> +static void config_cache_remove_value(const char *key, const char *value)
> +{
> +	struct config_cache_entry *e;
> +	int i;
> +
> +	e = config_cache_find_entry(key);
> +	if(!e)

Style: if (!e)

> +	/* If value == NULL then remove all the entries for the key */
> +	if(!value) {

Ditto (and elsewhere, but I'll stop noting each).

> +extern const char *git_config_get_string(const char *name)
> +{
> +	struct string_list *values;
> +	values = config_cache_get_value(name);
> +	if (!values)
> +		return NULL;
> +	return values->items[values->nr - 1].string;
> +}

The assumption here is that values->nr is never 0. I think that is true,
because we do not create the cache entry until we get a value (and
remove it if we drop the last value). However, it might be worth
documenting that with an assertion.

> @@ -333,6 +441,7 @@ static int get_value(config_fn_t fn, void *data, struct strbuf *name)
>               if (!value)
>                       return -1;
>       }
> +     config_cache_set_value(name->buf, value);
>       return fn(name->buf, value, data);
>  }

The function get_value gets called for each entry from a parsed config
file. However, it does not get called for command-line config from "git
-c". You'd need to update git_config_parse_parameter for that.

However, I wonder if this is the right place to hook into the cache at
all. The cache should be able to sit _atop_ the existing callback
system. So you could easily implement it as a separate layer, and just
lazily initialize it like:

  static int config_cache_callback(const char *key, const char *val, void *cache)
  {
	config_cache_set_value(cache, key, val);
	return 0;
  }

  static struct hashmap *get_config_cache(void)
  {
	static int initialized;
	static struct hashmap cache;
	if (!initialized) {
		hash_map_init(&cache, 
		git_config(config_cache_callback, &cache);
		initialized = 1;
	}
  }

and then teach config_cache_get_value and friends to access the cache
through "get_config_cache".

> @@ -1570,6 +1684,7 @@ int git_config_set_multivar_in_file(const char *config_filename,
>  		if (!store_write_section(fd, key) ||
>  		    !store_write_pair(fd, key, value))
>  			goto write_err_out;
> +		else config_cache_set_value(key, value);
>  	} else {
>  		struct stat st;
>  		char *contents;
> @@ -1673,6 +1788,9 @@ int git_config_set_multivar_in_file(const char *config_filename,
>  			}
>  			if (!store_write_pair(fd, key, value))
>  				goto write_err_out;
> +			else config_cache_set_value(key, value);
> +		} else {
> +			config_cache_remove_value(key, NULL);
>  		}
>  
>  		/* write the rest of the config */

These two seem suspect to me. How do we know that config_filename that
we are manipulating is even referenced by the config cache? When we
remove an entry, how do we know we are getting the entry from that file,
and not a similar entry in another file?

Stepping back, though, do we need to keep the cache up to date here with
micro-changes at all? If we have updated the config files on disk, then
the simplest thing would be to just invalidate the whole cache (and then
lazily re-read it if we actually need it again). This should be a rare
case (I suspect that "git init" and "git clone" might want this, but
that's about it).

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]