Re: [PATCH v4 3/9] config API: add and use a "git_config_get()" family of functions

Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> · Tue, 07 Feb 2023 11:29:15 +0100

On Thu, Feb 02 2023, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason  <avarab@xxxxxxxxx> writes:
>
>> We already have the basic "git_config_get_value()" function and its
>> "repo_*" and "configset" siblings to get a given "key" and assign the
>> last key found to a provided "value".
>>
>> But some callers don't care about that value, but just want to use the
>> return value of the "get_value()" function to check whether the key
>> exist (or another non-zero return value).
>>
>> The immediate motivation for this is that a subsequent commit will
>> need to change all callers of the "*_get_value_multi()" family of
>> functions. In two cases here we (ab)used it to check whether we had
>> any values for the given key, but didn't care about the return value.
>
> So, the idea is that 
>
> 	if (!git_config_get_string(key, &discard))
> 		free(discard);
> 	else
> 		... the key is missing ...
>
> becomes
>
> 	if (git_config_get(key))
> 		... the key is missing ...
>
> In other words, git_config_get() returns 0 only when the key is
> used, and non-zero return signals that the key is not used?
>
> Similarly, get_value_multi() was an interface to get to the
> value_list associated to the given key, and was abused like
>
> 	if (git_config_get_value_multi(key))
> 		... the key exists ...
>
> which will become
>
> 	if (!git_config_get(key))
> 		... the key exists ...
>
> right?

Yes, I've amended this in a re-roll to add tests to the configset tests,
so how the new API should be used becomes obvious.

>> @@ -3185,7 +3184,7 @@ static void configure_added_submodule(struct add_data *add_data)
>>  	 * is_submodule_active(), since that function needs to find
>>  	 * out the value of "submodule.active" again anyway.
>>  	 */
>> -	if (!git_config_get_string_tmp("submodule.active", &val)) {
>> +	if (!git_config_get("submodule.active")) {
>
> string_tmp() variant is to retrieve borrowed value, and it returns 0
> when there is a value.  If it is a valueless true, we get -1 back
> with an error message.  What does the updated version do in the
> valueless true case?

No, we'll get back 0, as a value-less key exists. 

This sort of code would be correct in general, as if we really want to
check if the key exists a value-less is a valid boolean true.

In this case we happen to segfault later on if we encounter such a key,
but that's unrelated to this being correct.

The segfault is then fixed later in this topic.

>> @@ -2425,8 +2433,25 @@ int git_configset_get_value(struct config_set *cs, const char *key, const char *
>>  
>>  const struct string_list *git_configset_get_value_multi(struct config_set *cs, const char *key)
>>  {
>> -	struct config_set_element *e = configset_find_element(cs, key);
>> -	return e ? &e->value_list : NULL;
>> +	struct config_set_element *e;
>> +
>> +	if (configset_find_element(cs, key, &e))
>> +		return NULL;
>> +	else if (!e)
>> +		return NULL;
>> +	return &e->value_list;
>> +}
>
> OK.  !e means "we got a healthy key and peeking into the hash table,
> there wasn't any entry for the key", and that is reported with NULL.
> Do we evern return a string list with .nr == 0, I wonder.  Having to
> deal with such a list would make the caller's job more complex, but
> perhaps we are not allowing the code to shrink value_list.nr to
> avoid such a situation?

No, we never return a list with .nr == 0. That's why Stolee's earlier
RFC tried to solve a subset of the problem this topic addresse by
returning such a list as a way to indicate "does not exist".

That would work to an extent, but would leave the main problem (which
Stolee wasn't aware of at the time) of having a ".nr > 0" list with
"NULL" elements in it.

It also wouldn't be idiomatic with the rest of the API, as this topic
shows, and means you can't distinguish non-existence from other errors.

>> +int git_configset_get(struct config_set *cs, const char *key)
>> +{
>> +	struct config_set_element *e;
>> +	int ret;
>> +
>> +	if ((ret = configset_find_element(cs, key, &e)))
>> +		return ret;
>> +	else if (!e)
>> +		return 1;
>> +	return 0;
>>  }
>
> OK.  So 0 return from the function means there is a value (or more)
> for a given key.  Good.
>
>> diff --git a/config.h b/config.h
>> index ef9eade6414..04c5e594015 100644
>> --- a/config.h
>> +++ b/config.h
>> @@ -471,9 +471,12 @@ void git_configset_clear(struct config_set *cs);
>>  
>>  /*
>>   * These functions return 1 if not found, and 0 if found, leaving the found
>> - * value in the 'dest' pointer.
>> + * value in the 'dest' pointer (if any).
>>   */
>
> Now the returned non-zero values are no longer 1 alone, no?
> Whatever lower-level functions use to signal an error is propagated
> up with the
>
> 	if ((ret = func())
> 		return ret;

That's still mostly true, but I should have added the new
git_configset_get() below this comment, and split it up, i.e. it's still
true for the functions that populate "dest".

I have a topic-on-top to fix those (to propagate them up), and it was
hard to know where to draw the line, in this case I got it wrong. Will
fix it.