Re: [PATCH 16/18] fsck: support demoting errors to warnings

Michael Haggerty <mhagger@xxxxxxxxxxxx> · Thu, 22 Jan 2015 16:49:31 +0100

On 12/23/2014 06:14 PM, Junio C Hamano wrote:
> Johannes Schindelin <Johannes.Schindelin@xxxxxx> writes:
>> On Tue, 23 Dec 2014, Junio C Hamano wrote:
>>> I suspect that it would be much better if the configuration variables
>>> were organized the other way around, e.g.
>>>
>>> 	$ git config fsck.warn missingTagger,someOtherKindOfError
>>
>> I had something similar in an earlier version of my patch series, but it
>> was shot down rightfully: if you want to allow inheriting defaults from
>> $HOME/.gitconfig, you have to configure the severity levels individually.
> 
> Hmmm.  What's wrong with "fsck.warn -missingTagger" that overrides
> the earlier one, or even "fsck.info missingTagger" after having
> "fsck.warn other,missingTagger,yetanother", with the usual "last one
> wins" rule?
> 
> Whoever shot it down "rightfully" is wrong here, I would think.

Sorry I didn't notice this earlier; Johannes, please CC me on these
series, especially the ones that I commented on earlier.

I might have been the one who "shot down" the "<severity>=<name>" style
of configuration [1].

I don't feel strongly enough to make a big deal about this, especially
considering that the other alternative has already been implemented. But
for the record, let me explain why I prefer the "<name>=<severity>"
style of configuration.

First, it is a truer representation of the data structure within the
software, which is basically one severity value for each error type.
This is not a decisive argument, but it often means that there is less
impedance mismatch between the style of configuration and the concepts
that it is configuring. For example,

    $ git config receive.fsck.warn A,B,C
    $ git config receive.fsck.error C,D,E

seems to be configuring two sets, but it is not. It is mysteriously
setting "C" to be an error, in seeming contradiction of the first line [2].

Second, it is not correct to say that this is just an application of the
"last setting wins" rule. The "last setting wins" rule has heretofore,
as far as I know, only covered *single* settings that take a single
value. If we applied that rule to the following:

    $ git config receive.fsck.warn A,B,C
    $ git config receive.fsck.warn B,F

then the net result would be "B,F". But that is not your proposal at
all; your proposal is for these two settings to be interpreted the same as

    $ git config receive.fsck.warn A,B,C,F

Similarly, the traditional last setting rule, applied to the first
example above, wouldn't cause the value of "fsck.warn" to be reduced to
"A,B", as you propose. This is not the "last setting rule" that we are
familiar with--it operates *across and within* values and across
*multiple* names rather than just across the values for a single name.

Third, the "<severity>=<name>" style is hard to inquire via the command
line, and probably also incompatible with the simplified internal config
API in git (and probably libgit2, JGit, etc). The problem is that
determining a *single* setting requires *three* configuration variables
be inquired, and that the settings for those three variables need to be
processed in the correct order, including the correct order of
interleavings. For example, how would you inquire about the configured
severity level of "missingTaggerEntry" using the shell? It would be a
mess that would necessarily have to involve "git config --get-regexp"
and error-prone parsing of comma-separated values. It would be so much
easier to type

    $ git config receive.fsck.missingtaggerentry

Fourth, the "<severity>=<name>" style would cause config files to get
cluttered up with unused values. Suppose you have earlier run

    $ git config receive.fsck.warn A,B,C
    $ git config receive.fsck.ignore D,E

and now you want to demote "B" to "ignore". You can do

    $ git config --add receive.fsck.ignore B

(don't forget "--add" or you've silently erased other, unrelated
settings!) This gives the behavior that you want. But now your config
file looks like

    [receive "fsck"]
            warn = A,B,C
            ignore = D,E
            ignore = B

The "B" on the first line is now just being carried along for no reason,
but it would be quite awkward to clean it up programmatically.
Effectively, these settings can only be added to but never removed
because of the way multiple properties are mashed into a single setting.

I believe that one of the main arguments for the "<severity>=<name>"
style of configuration is that it carries over more easily into
convenient command-line options. But I think it will be unusual to want
to configure these options by hand on the command line, let alone adjust
many settings at the same time. The idea isn't to make it easy to work
with repositories that have a level of breakage that fluctuates over
time. It is to make it possible to work with *specific* repositories
that have known breakage in their history. For such a repo you would
configure one or two "ignore" options one time and then never adjust
them again. (And it will also allow us to make our checks stricter in
the future without breaking existing repositories, and even to add
optional "policy" checks, like "forbid Windows-incompatible filenames".)

I would even go so far as to say that we don't *need* command-line
option versions of these settings; if somebody really needs that they
can type

    $ git -c receive.fsck.missingtaggerentry=ignore fsck

(which also has the advantage of passing the setting through to any
child processes). But *if* command-line options are considered
necessary, I don't think that using a "<name>=<severity>" style within
the config needs to rule out allowing command-line options in the form
"--<severity>=<name>,<name>" as Junio has suggested.

Looking back at this email, I guess that I'm more strongly against the
"<name>=<severity>" configuration style than I thought :-/

Michael

[1] I prefer to think that I just offered a little gentle discussion
that informed Johannes's independent decision :-)

[2] But even on these terms, it is anomalous. The usual git way to
configure a set in git would be

    $ git config receive.fsck.warn A
    $ git config --add receive.fsck.warn B
    $ git config --add receive.fsck.warn C
    $ git config receive.fsck.error C
    $ git config --add receive.fsck.error D
    $ git config --add receive.fsck.error E

, which in fact has fewer of the disadvantages listed in this email.

-- 
Michael Haggerty
mhagger@xxxxxxxxxxxx

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html