Semantic man(7) markup (was: Linux man-pages Makefile portability)

At 2022-07-24T17:57:31+0200, Ingo Schwarze wrote:
> Alejandro Colomar wrote on Sun, Jul 24, 2022 at 01:09:23PM +0200:
> > On 7/23/22 20:16, Ingo Schwarze wrote:
> >> The most widely used way to look up manual pages by the names of
> >> symbolic constants or type names probably is using macro keys as
> >> implemented in the mandoc version of apropos(1).  That is used by
> >> most FreeBSD, OpenBSD, Alpine Linux, and Void Linux.  I admit that
> >> doesn't qualify as "widely used", but "most widely used" is
> >> probably true all the same.  ;-)
> 
> > That leaves out man(7).

Perhaps not for long...

> Yes.  Searching for preprocessor constants and searching for data
> type names are essentially semantic search features.  So it is
> your choice to pick a 197x-era markup language that does not provide
> semantic markup but only physical markup.  But then it feels
> irrational to me to turn around and complain not getting semantic
> search.  Unless you are a Prime Minister, you cannot have your cake
> and eat it.
> 
> Trying to work around the lack of semantic markup by moving
> everything into the manual page names feels like very poor design
> to me.

It will not surprise, but might horrify, Ingo to learn that I have an
idea for how to add semantic markup to man(7).

Consider this hypothetical example.

  $ cat man3/man-pages.man
  .DC type B
  .DC field I
  $ cat man3/tm.3type
  .so man3/man-pages.man
[...]
  .SH DESCRIPTION
  .TG type "struct tm"
  describes time, broken down into distinct components.
  .PP
  .TG field tm_isdst
  describes whether daylight saving time is in effect at the time
  described.
[...]

Here, "DC" means "define class", a class of tags.  "TG", if one could
not guess, declares a tag of the type in its first argument with the
remaining arguments being the content thus tagged.

Returning to "DC", we see that it takes a second argument naming a macro
to call which will then apply any desired presentational markup to style
the tagged word.  This second argument need not be present.  In other
words, tagged content need not be visually distinct from its
surroundings.  Even in that event, it can still be useful; see #1 below.

Further, it will be obvious to the experienced *roff user that the macro
called by DC to style the applicable arguments given to TG need not even
be part of the man(7) language.

You could populate "man-pages.man" like this.

  $ cat man3/man-pages.man
  .de CW
  .  ie t \&\f[CR]\\$*\f[]
  .  el   \&\\$*
  ..
  .DC type CW
  .DC field I

This technique breaks the stranglehold of the man(7) font selection
macros.  (You're still limited by the output device's font repertoire,
however.)  If rendering to PostScript or PDF, you could decide to style
certain tags in Zapf Chancery Medium italic, if you wished.  (I cannot
warrant that you won't get yelled at.)

Here are a few perhaps less obvious things this approach would offer.

1.  It enables keyword search by tag.  Whatever does the searching need
    only look for "TG" calls, match the class argument, and return the
    remainder.  A search could be narrowed by limiting both the class
    _and_ the keyword arguments of course, perhaps to answer questions
    like "what pages use 'stat' as a data type?".

2.  Degraded operation for other/older man(7) implementations is
    straightforward.  'DC' can be completely ignored.  'TG' can be
    defined as follows.

    .de TG
    \&\\$*
    ..

    or, for truly bloody-minded portability, thus.

    .de TG
    \&\\$1 \\$2 \\$3 \\$4 \\$5 \\$6 \\$7 \\$8 \\$9
    ..

3.  Everyday man(7) page authors need only learn 'TG' and the available
    list of keywords for the suite of man pages to which they are
    contributing.  Hammering out the repertoire of available tag classes
    and the surely monumental bikeshedding of text styling decisions to
    be associated with each tag class is delegated to the project that
    chooses to define them.  The man(7) macro package itself will impose
    no policy and may not even define any tag classes to start with.
    (groff would have some for its own man pages, of course, as I would
    expect Linux man-pages to do.)

4.  Site admins offended at the styling decisions undertaken by various
    projects could reliably override them by editing the files sourced
    by the relevant man pages.  Maybe those should live in /etc rather
    than the man page hierarchy proper.

5.  Misspelling a tag class or using an unavailable one is an error that
    would be easily diagnosed and reported.
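
To make items 1 and 5 concrete, here is a rough sketch in Python of what
"whatever does the searching" might look like.  Nothing here exists
today: the function names (scan, search, undefined) are invented for
illustration, the argument splitting is a simplification of real *roff
quoting rules, and the scanner does not follow `.so` requests, so it
assumes the `DC` calls and the `TG` calls arrive in one stream of lines.

```python
import re

# A control line that calls the hypothetical DC or TG macro.  Spaces are
# permitted between the control character and the macro name.
_CALL = re.compile(r"^[.']\s*(DC|TG)\s+(.*)$")

# Simplified argument splitting: a double-quoted string is one argument,
# otherwise arguments are separated by white space.  (Real *roff quoting
# has more corner cases; this is an assumption made for brevity.)
_ARG = re.compile(r'"([^"]*)"|(\S+)')


def split_args(text):
    """Split the argument portion of a macro call into a list."""
    return [quoted or bare for quoted, bare in _ARG.findall(text)]


def scan(lines):
    """Collect class names declared with DC and every TG call.

    Returns (classes, tags), where tags is a list of
    (line number, class, tagged text) tuples.
    """
    classes, tags = set(), []
    for lineno, line in enumerate(lines, start=1):
        match = _CALL.match(line)
        if not match:
            continue
        macro, args = match.group(1), split_args(match.group(2))
        if not args:
            continue
        if macro == "DC":
            classes.add(args[0])
        else:
            tags.append((lineno, args[0], " ".join(args[1:])))
    return classes, tags


def search(tags, tag_class, keyword=None):
    """Return tagged text of one class, optionally narrowed by keyword."""
    return [text for _, cls, text in tags
            if cls == tag_class and (keyword is None or keyword in text)]


def undefined(classes, tags):
    """Report TG calls whose class was never declared with DC."""
    return [(lineno, cls) for lineno, cls, _ in tags if cls not in classes]
```

Feeding it the lines of a page (with the class definitions prepended)
and calling search(tags, "type") would list every tagged data type;
undefined(classes, tags) would point at the line number of each
misspelled or unavailable class.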

To reiterate, groff man(7) would impose no policy regarding the tag
classes or their rendering on anyone.  It similarly would escape the
ongoing problem that mdoc(7) chose for itself by administering
centralized authoritative lists of standards documents, operating system
releases, and other lexica.  Tagful man(7) pages under my proposal would
opt into whatever keyword/class discipline they desire, or not at all.

I am not wedded to the nomenclature for the included files, nor the `DC`
or `TG` macros, except to note that the macro names are available.
(`DT`, putatively for "define tag", is not.  It is already taken.)

I stand ready for the hail of rotten tomatoes.

Regards,
Branden
