Re: [PATCH v3 4/4] Documentation: add lint-fsck-msgids

Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> · Fri, 28 Oct 2022 05:11:07 +0200

On Thu, Oct 27 2022, John Cai wrote:

> Hi Ævar,
>
> On 26 Oct 2022, at 7:35, Ævar Arnfjörð Bjarmason wrote:
>
>> On Wed, Oct 26 2022, Jeff King wrote:
>>
>>> On Wed, Oct 26, 2022 at 04:43:32AM +0200, Ævar Arnfjörð Bjarmason wrote:
>>>
>>>>
>>>> On Tue, Oct 25 2022, Junio C Hamano wrote:
>>>>
>>>>> During the initial development of the fsck-msgids.txt feature, it
>>>>> has become apparent that it is very much error prone to make sure
>>>>> the description in the documentation file are sorted and correctly
>>>>> match what is in the fsck.h header file.
>>>>
>>>> I have local fixes for the same issues in the list of advice in our
>>>> docs, some of it's missing, wrong, out of date etc.
>>>>
>>>> I tried to quickly adapt the generation script I had for that, which
>>>> works nicely, and by line count much shorter than the lint :)
>>>
>>> Yeah, my instinct here was to generate rather than lint. If you make a
>>> mistake and the linter hits you over the head, that is better than
>>> quietly letting your mistake go. But better still is making it
>>> impossible to make in the first place.
>>>
>>> The downside is added complexity to the build, but it doesn't seem too
>>> bad in this case.
>>
>> Yeah, it's not, I have local patches to generate advice-{type,config}.h,
>> and builtin.h. This is a quick POC to do it for fsck-msgids.h.
>>
>> I see I forgot the .gitignore entry, so it's a rough POC :)
>>
>>> (I had a similar thought after getting hit on the head by the recent
>>> t0450-txt-doc-vs-help.sh).
>>
>> Sorry about that. FWIW I've wanted to assert that for a while, and to do
>> it by e.g. having the doc *.txt blurbs generated from running "$buildin
>> -h" during the build.
>
> If we wanted to go this route of generating the docs from the code (which sounds
> like a better long term solution), how would this work? Would we print out the
> list of message ids in builtin/fsck.c and write it to
> Documentation/fsck-msgids.txt ?

First, for the purposes of this thread I think Jeff and I are far off
into the weeds here :)

I think nothing needs to change in how this topic's doing things, we're
just talking about the longer term.

But if we go for that: I think in this case, and in most others I can
think of, generating the code from the docs is better (as that rough POC
I had showed), because:

 - You just need a shellscript to scrape the docs to make a *.c or *.h,
   whereas you'd need a C compiler to make the docs if it's the other
   way around. But more importantly:

 - The docs are way easier to scrape with some sed/awk/grep/whatever
   few-liner than to scrape C code for generating docs. E.g. see
   config-list.h.

 - Scraping the C code sucks so much that we'd probably make some
   dedicated interface for it, e.g. what we have for "git <cmd>
   --git-completion-helper".

   In that case it's worth it, but for other things we'd need to make
   the interface & maintain it (even if it's some test helper just for
   the build).
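
To make the first two points concrete, here is a minimal sketch of the
docs-to-code direction. It is not the POC from earlier in the thread,
and it is in Python rather than sed/awk purely for readability. It
assumes each documented message is an entry of the form "`camelCaseId`::"
followed by an indented "(SEVERITY) description" line, and that the
generated file wants "FUNC(ID, SEVERITY)" lines; the sample text and the
names scrape_msgids() and render_header() are hypothetical.

```python
import re

# Hypothetical excerpt in the assumed docs format, not copied from the real file.
SAMPLE_DOC = (
    "`badDate`::\n"
    "\t(ERROR) Invalid date format in an author/committer line.\n"
    "\n"
    "`hasDot`::\n"
    "\t(WARN) A tree contains an entry named `.`.\n"
)

# An entry is "`name`::" at the start of a line, then an indented "(SEVERITY)".
ENTRY_RE = re.compile(r"^`([A-Za-z]+)`::\n\s+\(([A-Z]+)\)", re.MULTILINE)


def camel_to_macro(name):
    """Convert a camelCase id to UPPER_SNAKE_CASE, e.g. badDate -> BAD_DATE."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).upper()


def scrape_msgids(doc_text):
    """Return (MACRO_NAME, SEVERITY) pairs in the order the docs list them."""
    return [(camel_to_macro(n), sev) for n, sev in ENTRY_RE.findall(doc_text)]


def render_header(entries):
    """Render the scraped pairs as one FUNC(...) line each (assumed format)."""
    lines = ["/* Generated from the docs, do not edit. */"]
    lines += [f"FUNC({msg_id}, {severity})" for msg_id, severity in entries]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_header(scrape_msgids(SAMPLE_DOC)), end="")
```

The whole scrape is one regular expression, which is the point: going
the other way would mean either parsing C or building and running a
helper binary before the docs could be produced.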

But mainly it helps to have a use-case. Replacing the linter script with
e.g. the *.sh I demo'd might be a marginal improvement, but e.g. "git
help -c" uses one of those generated files (config-list.h), and actually
does something useful ...

Is there a good use-case for the fsck data like that? I'd think that
we'd want to make sure the docs are in sync with the code, as in we're
not adding new warnings/errors etc. without documenting them. But beyond
that maybe not much, and people would just run "git help fsck" to get
the list of variables.