On synopsis grammar (was: Spaces in synopses of commands)

[adding groff list so that more people can argue with me, since I once
again found a soapbox to mount]

At 2023-07-30T18:14:53+0200, Alejandro Colomar wrote:
> On 2023-07-30 18:13, G. Branden Robinson wrote:
> > I think this is a matter of achieving an accurate and unambiguous
> > synopsis grammar.
> 
> Thanks; that kind of objective reasoning is what I wanted.  Would you
> mind stating it in the commit message for posterity?  :-)

I think I'll add it to the explanation of the example synopsis in
groff_man_style(7), too.  ;-)

While I'd love for synopsis grammar to be _fully_ unambiguous, one
unfortunate case did arise in discussion with mandoc maintainer Ingo
Schwarze on the groff mailing list in the past year or two.

Consider:

foocmd [-abort] file ...

Is this a command that takes up to 5 different options -a, -b, -o, -r,
-t, or a command that takes one option called "abort"?

A program in the BSD tradition might suggest one answer and a program in
the X11 tradition another.  I assume that this is not a new observation,
and is why the GNU project introduced (or adopted from some
now-forgotten progenitor) the double-dash long-option-name convention.
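
To make the two readings concrete, here is a minimal Python sketch.  It
is not taken from any real program: parse_single_dash_long() and its
"known" table are hypothetical, written only to stand in for an
X11-style parser.  The clustered reading uses the standard library's
getopt module.

```python
import getopt


def parse_clustered(argv):
    """BSD-style reading: '-abort' is a cluster of one-letter flags."""
    opts, operands = getopt.getopt(argv, "abort")
    return [name for name, _ in opts], operands


def parse_single_dash_long(argv, known=("abort",)):
    """X11-style reading: '-abort' is one option named 'abort'.

    Hypothetical helper; 'known' is an assumed table of option names.
    """
    opts = []
    i = 0
    while i < len(argv) and argv[i].startswith("-") and len(argv[i]) > 1:
        name = argv[i][1:]
        if name not in known:
            raise ValueError(f"unrecognized option '{name}'")
        opts.append("-" + name)
        i += 1
    return opts, argv[i:]


argv = ["-abort", "file1", "file2"]
print(parse_clustered(argv))         # five separate flags
print(parse_single_dash_long(argv))  # one option
```

The same argument vector yields five options under one parser and one
option under the other, and the synopsis "[-abort]" alone cannot tell
the reader which parser the command uses.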

While we could eliminate the ambiguity by insisting upon a practice of
setting each short option in its own set of optional-argument brackets,
that would come at a significant cost in visual clutter.

Consider the groff(1) command, already ornamented richly with options.

    groff [-abcCeEgGijklNpRsStUVXzZ] [-d ctext] [-d string=text]
          [-D fallback‐encoding] [-f font‐family] [-F font‐directory]
          [-I inclusion‐directory] [-K input‐encoding] [-L spooler‐
          argument] [-m macro‐package] [-M macro‐directory] [-n page‐
          number] [-o page‐list] [-P postprocessor‐argument]
          [-r cnumeric‐expression] [-r register=numeric‐expression]
          [-T output‐device] [-w warning‐category] [-W warning‐category]
          [file ...]

In a quest for zero ambiguity, we might say:

    groff [-a] [-b] [-c] [-C] [-e] [-E] [-g] [-G] [-i] [-j] [-k] [-l]
          [-N] [-p] [-R] [-s] [-S] [-t] [-U] [-V] [-X] [-z] [-Z]
          [-d ctext] [-d string=text] [-D fallback‐encoding]
          [-f font‐family] [-F font‐directory] [-I inclusion‐directory]
          [-K input‐encoding] [-L spooler‐argument] [-m macro‐package]
          [-M macro‐directory] [-n page‐number] [-o page‐list]
          [-P postprocessor‐argument] [-r cnumeric‐expression]
          [-r register=numeric‐expression] [-T output‐device]
          [-w warning‐category] [-W warning‐category] [file ...]

And with that done, we might as well lexicographically order all the
options.

    groff [-a] [-b] [-c] [-C] [-d ctext] [-d string=text]
          [-D fallback‐encoding] [-e] [-E] [-f font‐family]
          [-F font‐directory] [-g] [-G] [-i] [-I inclusion‐directory]
          [-j] [-k] [-K input‐encoding] [-l] [-L spooler‐argument]
          [-m macro‐package] [-M macro‐directory] [-n page‐number] [-N]
          [-o page‐list] [-p] [-P postprocessor‐argument]
          [-r cnumeric‐expression] [-r register=numeric‐expression]
          [-R] [-s] [-S] [-t] [-T output‐device] [-U] [-V]
          [-w warning‐category] [-W warning‐category] [-X] [-z] [-Z]
          [file ...]

...but that doesn't seem like an improvement to me.  Options that don't
take arguments are typically of Boolean sense.  (Occasionally, as with
some applications of '-v', they model an incrementation operation of
some kind.)  "Argumentful" options require further decision-making from
the user and it thus seems useful, to me, to segregate the two
categories.  Some traditions evolve for good reasons.  :)

As an aside, one might wonder why the groff(1) page uses such long
metasyntactic variable names in 1.23.0 when it did not in 1.22.4.  After
years of working on groff's ~60 man pages, I came to adopt a handful of
principles.

1.  A command should always offer a usage message via '--help',
    presenting a (plain text) synopsis much like the above.

2.  That synopsis, and the one in the corresponding man page, should
    match.

3.  A _usage_ message should be _useful_.

    $ foo --barblegarg
    foo: error: unrecognized option 'barblegarg'
    foo: usage: foo [options] [files]

    is so un-useful as to be user-hostile.  A programmer who writes this
    should be frank about their contempt for the user and drop such
    "usage advice" entirely.[1]

    Consider the novice user of groff.  They might wonder, "is lowercase
    'm' the flag letter for the macro package name and '-M' the one to
    add a macro search directory, or the other way around"?  Output like
    I presented for it above answers such a question.

4.  A usage message should not dump an _explanation_ of all options.  A
    person accustomed to the Unix command line philosophy of "no news is
    good news" will rightly be dismayed when a command invocation they
    expect to perform some task quietly and return to the shell prompt
    instead spews a gout of text to the terminal.  If many options are
    supported, and/or their explanations demand much space to present,
    the _actual problem_ with the command can easily scroll away.  Yes,
    maybe everybody has terminals with scrollback buffers these days, but
    it's still rude.  When something has gone wrong, a user's immediate
    response should not be to pound on the keyboard some more, but to
    pause, take a breath if necessary, and gather useful information
    from the screen.  If our "helpful" command hasn't left the most
    important information _on_ the screen, that's harder to do.

5.  It's okay to present a lengthy usage message, with much detail, if
    a user explicitly requests "--help".  But because lengthy runs of
    text can get out of sync, I prefer to maintain such things in one
    place--the command's man page.

6.  Ideally, you'd store things like metasyntactic variable names for
    command-line options in a data structure inside the command's
    sources, and a mechanism, possibly an environment variable or an
    otherwise "maintainer mode" command-line option, would dump a
    well-formed synopsis in man(7) format[2] using this information to
    the standard output.  As part of package build, one could then apply
    this output to a templated man page document to produce the shipping
    page.

    I first had this idea something like 25 years ago and I'm sure many
    other people have, too, it being such an obvious application of the
    DRY principle.  I can only guess that it didn't happen because
    getopt_long() is a GNU thing; GNU people (okay, let's be precise:
    GNU Emacs people), historically, have held man pages beneath
    contempt; and nobody else had both the traction and desire to get it
    done.  (Engineers paid to work on or adjacent to the Linux kernel
    seem always to have struggled, either with themselves or their
    managers, to justify expending more than a minimal effort on
    documentation of any sort.  Thus did both sides of GNU/Linux's white
    picket fence become brownfields.)

7.  One place we _don't_ need information-rich metasyntactic variable
    names is where we're going to spend a lot of words explaining them
    anyway.  So after over-applying a principle of militant synchrony,
    I found that "Options" sections of man pages[3] could get by
    pedagogically just as well with short ones; they were easier to cope
    with typographically as well, improving the regularity of formatting
    (which is helpful to the reader, visually) and reducing the need for
    *roff stunts in man page sources to achieve consistent indentation
    in a series of tagged paragraphs.

    Consider again groff(1) options.  Here's the synopsis/usage message
    again, abbreviated.

    groff [-K input‐encoding] [-L spooler‐argument] [-T output‐device]

    And here's the corresponding material from its man page's "Options"
    section.

       -K enc Set input encoding used by preconv(1) to enc; implies -k.

       -L arg Pass  arg  to the print spooler program.  If multiple args
              are required, pass each with a separate -L option.   groff
              does not prefix an option dash to arg before passing it to
              the spooler program.

       -T dev Direct troff to format the input  for  the  output  device
              dev.  groff then calls an output driver to convert troff’s
              output to a form appropriate for dev; see subsection “Out‐
              put devices” below.

    (I haven't forgotten that you prefer two spaces between a tag and
    the body of a tagged paragraph.  I agree that it would look better.
    I still intend to add a tunable parameter for that [defaulting to
    2n], probably around the same time I do so for the base paragraph
    indentation amount.)

    We don't need a metasyntactic variable name as long as your arm when
    explaining fully in adjacent text what the parameter means.  At the
    same time, replacing all such names in the foregoing example with
    just "x" would be laconic to excess.
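
The single-source idea in point 6 can be sketched in a few lines of
Python.  This is an illustration under assumptions, not groff's
implementation: the OPTIONS table, the function names, and the 72-column
width are invented here, and the table holds only a small subset of
groff's options.  Argumentless flags are clustered in one bracket and
"argumentful" options are listed separately, as in the synopses above.
The man(7) output uses the .SY, .OP, and .YS synopsis macros that groff's
man macro package provides as extensions.

```python
# Hypothetical option table: (flag letter, argument name or None).
OPTIONS = [
    ("a", None),
    ("b", None),
    ("K", "input-encoding"),
    ("L", "spooler-argument"),
    ("T", "output-device"),
    ("V", None),
]


def synopsis_items(options, operands="[file ...]"):
    """Return bracketed synopsis items: clustered flags first."""
    flags = "".join(letter for letter, arg in options if arg is None)
    items = [f"[-{flags}]"] if flags else []
    items += [f"[-{letter} {arg}]" for letter, arg in options if arg]
    if operands:
        items.append(operands)
    return items


def usage_text(command, options, width=72):
    """Plain-text synopsis for --help; never breaks inside an item."""
    indent = " " * (len(command) + 1)
    lines, current = [], command
    for item in synopsis_items(options):
        if len(current) + 1 + len(item) > width:
            lines.append(current)
            current = indent + item
        else:
            current = current + " " + item
    lines.append(current)
    return "\n".join(lines)


def man_synopsis(command, options):
    """The same synopsis as man(7) source using groff's .SY/.OP/.YS."""
    flags = "".join(letter for letter, arg in options if arg is None)
    lines = [f".SY {command}"]
    if flags:
        lines.append(f".OP \\-{flags}")
    lines += [f".OP \\-{letter} {arg}" for letter, arg in options if arg]
    lines.append('.RI [ file " ...]"')  # operand line assumed for this sketch
    lines.append(".YS")
    return "\n".join(lines)


print(usage_text("groff", OPTIONS))
print(man_synopsis("groff", OPTIONS))
```

Because both functions read the same table, the usage message and the
man page synopsis cannot drift apart; a build step could paste the
man_synopsis() output into a templated page.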

The Unix-Haters' Handbook is due for a second edition, isn't it?  ;-)

Regards,
Branden

[1] "Use the source, Luke," a.k.a. "see Figure 1."

[2] or some JSONic thing easily transformed into man(7) or another
    desired format

[3] Full disclosure: mandoc maintainer and mdoc(7) advocate Ingo
    Schwarze opposes the existence of "Options" sections in man pages.

    https://lists.gnu.org/archive/html/groff/2018-11/msg00031.html
