Re: [PATCH v3 2/3] config: add hashtable for config parsing & retrieval

Junio C Hamano <gitster@xxxxxxxxx> · Wed, 25 Jun 2014 11:13:55 -0700

Ramsay Jones <ramsay@xxxxxxxxxxxxxxxxxxx> writes:

> On 24/06/14 00:25, Junio C Hamano wrote:
> ...
>> Yup, that is a very good point.  There needs an infrastructure to
>> tie a set of files (i.e. the standard one being the chain of
>> system-global /etc/gitconfig to repo-specific .git/config, and any
>> custom one that can be specified by the caller like submodule code)
>> to a separate hashmap; a future built-in submodule code would use
>> two hashmaps aka "config-caches", one to manage the usual
>> "configuration" and the other to manage the contents of the
>> .gitmodules file.
>> 
>
> I had expected to see one hash table per file/blob, with the three
> standard config hash tables linked together to implement the scope/
> priority rules. (Well, these could be merged into one, as the current
> code does, since that makes handling "multi" keys slightly easier).

Again, good point.  I think a rough outline of a design that takes
both

 (1) we may have to read two or more separate sets of "config like
     things" (e.g. the contents from the normal config system and
     the contents from the .gitmodules file) and

 (2) we may have to read two or more files that make up a logically
     single set of "config-like things" (e.g. the "normal config
     system" reads from three separate files)

into account may look like this:

 * Each instance of in-core "config-like things" is expressed as a
   struct "config-set".

 * A "config-set" points at an ordered set of struct "config-file",
   each of which represents what was read and cached in-core from a
   file.

 * When we know or notice that a single file on the filesystem was
   modified, we do not have to invalidate the whole "config-set"
   that depends on the file; the "config-file" that corresponds to
   the file on the filesystem is invalidated instead.

 * The most generic API to read the values for a given key or
   enumerate the keys in a set of "config-like things" takes
   "config-set" as an argument, and reads from the ordered set of
   "config-file" to keep the established illusion that we read them
   all and accumulate, leading to "the last one wins" for single
   valued variables.

 * Because reading from the normal config system happens everywhere
   in the existing code, we will have one struct "config-set"
   instance, called "the_config_set", and a parallel set of
   functions mirroring the most generic API above that do not take
   the "config-set" as an explicit argument.  They operate on the
   singleton instance the_config_set.
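
For illustration only, the outline above could be sketched roughly
as follows.  Every name here (struct config_file,
configset_get_string, configset_invalidate_file) is hypothetical, a
plain array stands in for the hashmap, and the return convention
(0 when found, 1 when not) is an assumption rather than anything
decided in this thread.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch; none of these names exist in git today. */
struct config_entry {
	const char *key;
	const char *value;
};

/* What was read and cached in-core from a single file. */
struct config_file {
	const char *path;
	struct config_entry *entries;	/* in file order; a real version would hash */
	size_t nr;
	int valid;			/* cleared when the file on disk changes */
};

/* An ordered set of config_file, lowest priority first. */
struct config_set {
	struct config_file **files;
	size_t nr;
};

/*
 * "The last one wins": scan files and entries from the end and
 * stop at the first match.  Returns 0 if found, 1 otherwise
 * (an assumed convention).  Re-reading a file whose "valid" flag
 * is cleared is left out of this sketch.
 */
int configset_get_string(const struct config_set *cs,
			 const char *key, const char **value)
{
	size_t i, j;

	for (i = cs->nr; i-- > 0; ) {
		const struct config_file *cf = cs->files[i];
		for (j = cf->nr; j-- > 0; ) {
			if (!strcmp(cf->entries[j].key, key)) {
				*value = cf->entries[j].value;
				return 0;
			}
		}
	}
	return 1;
}

/*
 * Invalidate only the config_file backed by "path"; the rest of
 * the set stays cached.  Returns the number of files marked.
 */
int configset_invalidate_file(struct config_set *cs, const char *path)
{
	size_t i;
	int marked = 0;

	for (i = 0; i < cs->nr; i++) {
		if (!strcmp(cs->files[i]->path, path)) {
			cs->files[i]->valid = 0;
			marked++;
		}
	}
	return marked;
}
```

Scanning from the end keeps the "read them all and accumulate"
illusion without merging the files into one table, which is what
makes per-file invalidation cheap.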

The implementation of the API function git_config_get_string() may
look like this:

	int git_config_get_string(const char *key, const char **value)
	{
		return git_configset_get_string(&the_config_set, key, value);
	}

which is a thin wrapper around the more generic API that lets you
read from an arbitrary config_set.  The wrapper may even look like
this:

	#define git_config_get_string(k, v) \
		git_configset_get_string(&the_config_set, (k), (v))

When the submodule script that uses "git config -f .gitmodules" is
converted into C, if the updated config API is ready, it may be able
to do something like this in a single program:

	const char *url;
	struct config_set *gm_config;

	/* read from $GIT_DIR/config */
	git_config_get_string("remote.origin.url", &url);

	/*
	 * allow us to read from .gitmodules; the parameters are
	 * the list of files that make up the configset, perhaps.
	 */
	gm_config = git_configset_new(".gitmodules", NULL);

	if (!git_configset_get_bool(gm_config, "submodule.frotz.ignore")) {
		/* do a lot of stuff for the submodule */
		...
	}

	/* when we are done with the configset */
	git_configset_clear(gm_config);
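
If the constructor does take a NULL-terminated list of files, the
variadic part might be sketched like this.  The struct and function
names (configset_paths, configset_paths_new, configset_paths_free)
are made up for the example and only collect the paths; reading and
caching the files is left out.

```c
#include <stdarg.h>
#include <stdlib.h>

/* Hypothetical: just the ordered list of paths behind a configset. */
struct configset_paths {
	const char **paths;	/* in the order given by the caller */
	size_t nr;
};

/*
 * Collect a NULL-terminated argument list of paths.  Strictly
 * portable callers should terminate with (const char *)NULL.
 * Returns NULL on allocation failure.
 */
struct configset_paths *configset_paths_new(const char *first, ...)
{
	va_list ap;
	const char *p;
	size_t n = 0, i = 0;
	struct configset_paths *cp;

	/* first pass: count the arguments */
	va_start(ap, first);
	for (p = first; p; p = va_arg(ap, const char *))
		n++;
	va_end(ap);

	cp = malloc(sizeof(*cp));
	if (!cp)
		return NULL;
	cp->paths = malloc((n ? n : 1) * sizeof(*cp->paths));
	if (!cp->paths) {
		free(cp);
		return NULL;
	}
	cp->nr = n;

	/* second pass: store the pointers (not copies) in order */
	va_start(ap, first);
	for (p = first; p; p = va_arg(ap, const char *))
		cp->paths[i++] = p;
	va_end(ap);

	return cp;
}

/* Release the list itself; the path strings belong to the caller. */
void configset_paths_free(struct configset_paths *cp)
{
	if (!cp)
		return;
	free(cp->paths);
	free(cp);
}
```

A caller would write something like
configset_paths_new("/etc/gitconfig", ".git/config", (const char *)NULL),
and the order of the arguments would double as the priority order.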
