Re: [PATCH v3 4/4] Documentation: add lint-fsck-msgids

Junio C Hamano <gitster@xxxxxxxxx> · Thu, 27 Oct 2022 22:32:30 -0700

Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> writes:

> First, for the purposes of this thread I think Jeff and I are far off
> into the weeds here :)

It is good to clearly separate where we would want to draw the line
for this round, to get the already-worked-on-and-immediately-available
improvement going, while we envision a future direction for the longer
term.

> But if we go for that: I think in this case & most I can think of
> generating the code from the docs is better (as that rough POC I had
> showed), because:
>
>  - You just need a shellscript to scrape the docs to make a *.c or *.h,
>    whereas you'd need a C compiler to make the docs if it's the other
>    way around. But more importantly:
>
>  - The docs are way easier to scrape with some sed/awk/grep/whatever
>    few-liner than to scrape C code for generating docs. E.g. see
>    config-list.h.

Scraping docs is easier because we do not have to choose from
multiple choices that are all reasonable ;-).  Either way, the
source material needs some discipline to keep it scrapable (e.g. to
keep the doc scrapable, you'd probably keep each entry a single
line, or a fixed format like "<token>::" followed by "^I(<token>) "
followed by description.  Nothing forbids us from giving developers
a similar rule to keep each entry in FOR_EACH_MSG_ID() macro easier
to scrape, so it is about the same difficulty going either way.

But if you choose to make the code the source of truth, you'd have
to see if it makes more sense to "compile and run" instead of
scraping.  That's another thing to consider and choose from, which
makes it harder ;-)

> But mainly it helps to have a use-case, replacing the linter script with
> e.g. the *.sh I demo'd might be a marginal improvement. But e.g. "git
> help -c" uses one of those generated files (config-list.h), and actually
> does something useful ...

Yes, I've shown how N_("explanation of the error") may fit into the
existing scheme in a separate message upthread.  If we go from code
to doc, it would be a reasonable starting point.

Whichever way the aout-generation goes, we'd complicate the build
dependency graph, which is a downside.  Another is that third-party
consumers of docs now need to generate some docs from the source,
which may be additional burden for them.

> Is there a good use-case for the fsck data like that? I'd think that
> we'd want to make sure the docs are in sync with the code, as in we're
> not adding new warnings/errors etc. without documenting them. But beyond
> that maybe not much, and people would just run "git help fsck" to get
> the list of variables..

"git help fsck-error-codes" that does not have a pre-generated
documentation (instead we'd just dump the N_("explanation") to the
output) is certainly tempting.  I am not sure if it would fly well.
When was the last time you saw a manpage that says "run this command
and view its output for the most up-to-date list" ;-)?