Re: Problem in prepare.pl (PDF book script) when handling Unix V10 manual pages

On Monday, 17 February 2025 22:22:46 GMT G. Branden Robinson wrote:

> [CCing groff@gnu list because some problems arise here that merit being

>  findable by search of its list archives]

>

> Hi Deri,

>

> At 2025-02-17T18:52:46+0000, Deri wrote:

> > >     programs in constructed pipeline:

> > >    

> > >     GNU grops (groff) version 1.23.0.2695-49927

> > >     GNU troff (groff) version 1.23.0.2695-49927

>

> [...]

>

> > Since the v10 pages are intended to run on a version of troff with a

> > two character name limit (I think), code such as ".ne4" causes a

> > problem for groff, which needs ".ne 4" to work (otherwise groff looks

> > for a macro called "ne4" and fails). Many of these issues are now

> > corrected.

>

> We do have compatibility mode to support old-style AT&T troff input.

>

> troff(1):

>      -C       Enable AT&T troff compatibility mode; implies -c.  See

>               groff_diff(7).

>

> However...

>

> [skipping ahead]

>

> > but changing some "$" to "\[Do]" fixed the problem.

>

> ...if you're doing that, you foreclose use of `\[Do]` for 2 reasons.

>

> 1.  That syntax is a groff extension (the AT&T troff form would be

>     `\(Do`)...but worse...

> 2.  `Do` is not a special character identifier generally recognized by

>     AT&T-family troffs.  And there's no way within the AT&T *roff

>     language to define new ones.  Fortunately, in Kernighan troff, it's

>     not hard to add them to font description files.  As long as you have

>     superuser privileges.


Hi Branden,


My prepare.pl is only supported for groff, so I have no interest in making it compatible with AT&T troffs.
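
For the ".ne4" case quoted above, a groff-only fix can be a plain textual rewrite. The following is a minimal sketch in Python rather than Perl, and it is not taken from prepare.pl. The function name and the short request list (ne, sp) are assumptions for illustration; sp is included because "sp3", "sp5" and "sp.5" turn up in the attached log.

```python
import re

# Assumption: only these two-letter requests need fixing. Extend the
# alternation as further cases turn up in the groff warnings.
FUSED_REQUEST = re.compile(r"^([.'])(ne|sp)(?=[-+.\d])")


def split_fused_request(line: str) -> str:
    """Insert a space between a request name and a numeric argument fused to it."""
    # The lookahead only fires when a digit, sign or decimal point follows
    # the name directly, so ".sp 2" and ".next" are left untouched.
    return FUSED_REQUEST.sub(r"\1\2 ", line)


if __name__ == "__main__":
    for sample in (".ne4", ".sp.5", ".sp 2", ".next"):
        print(f"{sample!r} -> {split_fused_request(sample)!r}")
```

Restricting the match to a known list of requests is deliberate: a blanket "two letters then anything" rule would also split legitimate longer macro names.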


> > A strange issue is that if a page contained a "$" character it sent

> > eqn into the stratosphere (thinking it was dealing with an inline

> > equation); I killed it when eqn chewed up over 24 GB of memory. I have

> > no idea why, and it is not triggered by a single page containing a

> > "$", so it must be triggered by something in an earlier man page which

> > triggers it, but changing some "$" to "\[Do]" fixed the problem.

>

> I surmise that this book building system either runs groff with the `-e`

> option, or pipes the pages through eqn(1) explicitly, so that every page

> gets preprocessed by eqn.  That's not wrong--in fact it's probably the

> sanest thing to do--but it does expose you to scenarios like this.

>

> I'd bet a U.S. 50-cent piece that some page had this in it:

>

> .EQ

> ...

> delim $$

> ...

> .EN

>

> and then never did this later:

>

> .EQ

> ...

> delim off

> ...

> .EN

>

> ...because who ever formats more than one man page at a time?

>

> So upon encountering a `$` in an eqnless man page later, the eqn

> preprocessor would indeed then start gobbling up the entire remainder of

> the input for attempted conversion to troff input.

>

> GNU eqn added an option that strongly mitigates this and another

> problem:

>

> eqn(1):

>      -N      Prohibit newlines within delimiters, allowing eqn to

>              recover better from missing closing delimiters.

>

> ...and the groff(1) front-end exposes it too, for convenience:

>

> groff(1):

>      -N       Prohibit newlines between eqn delimiters: pass -N to

>               eqn(1).

>

> ...however before reaching for this solution, the corpus of pages being

> formatted needs to be audited to ensure that no multiline, inline use of

> eqn is attempted.  If it is, the pages must be altered to either:

>

> 1.  stop doing that--maybe by joining lines--enabling use of `-N`;

> 2.  migrate the "inline" math to EQ/EN bracketing (groff man(7) doesn't

>     define `EQ` and `EN` to set the math as a display, so this _should_

>     work okay), also enabling use of `-N`; or

> 3.  find the spot where `delim off` should have been and add it.


Alex is in charge of the workflow pipeline. prepare.pl runs first and produces a [con]catenation (I remembered your preference :-) ) of all the man pages, which is then run through all the preprocessors, groff -Z, and finally gropdf. I'm tempted by your 3rd idea: since I am already doing other "fixes", it is trivial to add. Adding -N is the simplest option, but it requires what may be a difficult audit.
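
One way the 3rd idea could look when applied per page, before concatenation, is sketched here in Python (not the Perl of prepare.pl). It assumes that "delim" appears at the start of a line inside an EQ/EN block, and it appends the reset at the end of the page rather than hunting for the exact spot where "delim off" was forgotten. The function name is hypothetical.

```python
import re

# Matches "delim $$", "delim off", etc. at the start of a line.
DELIM = re.compile(r"^\s*delim\s+(\S+)")


def close_open_delimiters(page: str) -> str:
    """Append an EQ/EN block containing 'delim off' if the page leaves
    eqn inline delimiters switched on."""
    in_eq = False
    delimiters_on = False
    for line in page.splitlines():
        if line.startswith(".EQ"):
            in_eq = True
        elif line.startswith(".EN"):
            in_eq = False
        elif in_eq:
            match = DELIM.match(line)
            if match:
                # The last "delim" statement seen wins.
                delimiters_on = match.group(1) != "off"
    if delimiters_on:
        if not page.endswith("\n"):
            page += "\n"
        page += ".EQ\ndelim off\n.EN\n"
    return page
```

This only stops the delimiters leaking into later pages. A stray "$" later in the same page that set them would still be read as the start of an equation.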


> > One page redefined the ".P" man macro, which then affects all

> > following man pages.

>

> Naughty, naughty!  I've wondered in the past about adding support for

> "burning it all down and redefining all interface macros" in groff's

> "an.tmac" (specifically when hitting a new `TH`).[1]  But I decided that

> people wouldn't believe me that this was a practical hazard.  Thanks for

> pointing me to a real-world case!  :D

>

> > One page introduced a string register called "mc" which then masks the

> > groff command ".mc" with very strange results.

>

> That's not just a groff request name, but an AT&T one.  Hard to imagine

> how that isn't a bug, or at least a deeply unwise practice.  People

> might want to use {g,}diffmk(1) on man pages, and trashing the mechanism

> for setting up the margin character defeats such usage.

>

> Unfortunately man page authorship culture did not evolve in a direction

> such that people making changes to the formatter's environment (in the

> broad sense, not the *roff concept) put things back the way they found

> them.  Approximately every man page is written in the expectation that

> the formatter will exit once the last line of _this_ man page document

> is read.

>

> Just like how you don't need to bother to free heap-allocated memory in

> your programs unless you think _you'll_ need it.  It's the free store!

> Grab as much as you want and forget about it!  When your process dies

> the OS will reclaim it all anyway, no harm, no foul.

>

> It's no wonder Unix culture produced so many code cowboys.

>

> > Font L is used in many entries, no clue what font this is, but I

> > convert to font CB. Please change to taste (see lines 130 onwards).

>

> Good call.  `L` (presumably abbreviating "literal") was a latter-day

> Research Unix convention for font and macro names that I have not seen

> in materials originating outside the 1980s CSRC.  AT&T Documenter's

> Workbench (~1984-~1994), for example, did not appear to embrace it.

>

> > Several pages use lower case macro names, i.e. ".th" rather than

> > ".TH".

>

> Wow.  Those could be hangovers from pre-Seventh Edition Unix "man".

> But I thought Doug McIlroy got all of those ported/rewritten for Seventh

> Edition.

>

> Nevertheless, at least System III,[2] v8, and v10 retained support for

> Sixth Edition style man pages.  For example:

>

> $ head -n 5 v8/usr/lib/macros/an

> '''\"   PWB Manual Entry Macros - 1.36 of 11/11/80

> '''\"   Nroff/Troff Version     @(#)1.36

> .deth

> .tmwrong version of man entry macros - use -man6

> .ab

>

> So be careful out there if you don't want Dave Mustaine to snarl at you!

>

> > I have "fixed" a lot of the problems but there are still many warnings

> > when running groff. I have attached two patches, one for the V10 man

> > pages, and one for prepare.pl. You should be able to produce a

> > "useful" book after applying these.

> >

> > If you wish to see the fruits of my labour as a pdf, it is here:-

> >

> > http://chuzzlewit.co.uk/UnixV10.pdf

>

> This looks really good!  It's wonderful to see a working, useful

> navigation pane, and at least some internal hyperlinks are working.

> Some aren't, and at a glance it's not obvious to me why.  (It's not the

> first argument to `TH` being in shouting capitals that hoses things, and

> that's not practiced with 100% reliability anyway--see as80(1) and

> ld80(1), for example.)


I think these are outliers, written in 1977, and using macro calls I have never heard of (.s1, .s3, .i0, ...). If you look at the V10 patch for the man pages you will see heavy editing of these pages, mainly changing anything I did not understand to .LP, since, whatever they are, they definitely expected a line break!


The majority of hyperlinks are fine, but remember that AT&T did not have a nice .MR macro to identify hyperlinks. The LinuxManBook is fairly consistent in using:-


.BR name (section)


Fortunately, this corresponds to filename.section rather than to what is in the .TH line. The V10 corpus mainly uses:-


.IR name (section)


Again ignoring the actual .TH entry, but unfortunately it is not as consistent and there are anomalies. The 80.out.5 page codes it this way:-


.SH "SEE ALSO"

"as80" (I), "ld80" (I), "nm80" (I)


It's not really giving me much chance!
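
To make the three reference styles concrete, here is a rough Python sketch of how they could be recognised. It is not what prepare.pl does. The patterns, the function name, and the roman-numeral table mapping section I to 1 and so on are all assumptions for illustration.

```python
import re

# ".BR name (section)" or ".IR name (section)"; the section may carry a
# letter suffix such as "1g".
MACRO_REF = re.compile(r"^\.(?:BR|IR)\s+(\S+)\s+\(([0-9][0-9a-z]*)\)")

# '"as80" (I)' style, with the section given as a roman numeral.
QUOTED_REF = re.compile(r'"([^"]+)"\s+\(([IVX]+)\)')

# Assumption: roman-numeral sections map directly onto the modern numbers.
ROMAN = {"I": "1", "II": "2", "III": "3", "IV": "4",
         "V": "5", "VI": "6", "VII": "7", "VIII": "8"}


def find_references(line: str) -> list[tuple[str, str]]:
    """Return (name, section) pairs found on one line of man page source."""
    match = MACRO_REF.match(line)
    if match:
        return [(match.group(1), match.group(2))]
    refs = []
    for name, roman in QUOTED_REF.findall(line):
        section = ROMAN.get(roman)
        if section is not None:
            refs.append((name, section))
    return refs


if __name__ == "__main__":
    for sample in ('.BR name (1)', '.IR name (1g)',
                   '"as80" (I), "ld80" (I), "nm80" (I)'):
        print(sample, "->", find_references(sample))
```

Each pair would still need to be checked against the actual filename.section list before a link is emitted, since the source is not consistent.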


> In fact those two pages are weird in a few respects.  Obvious spelling

> errors on the one hand ("moduals"?), and the latter uses a really old

> Unix manual convention, identifying the section numbers with roman

> numerals.

>

> Where modernization for PDF rendering purposes stops and the Research

> Tenth Edition Programmer's Manual, Volume 1 editorial effort begins anew

> may prove a difficult boundary to draw.


I have done my best with "difficult" source. I gave myself a pretty low target, "produce something at least readable", rather than fidelity to how it would have looked printed in 1980. Manually working through the groff errors would certainly improve the finished product. I have attached my log.


Cheers


Deri


> Regards,

> Branden

>

> [1] One bad approach, IMO, would be to define all interface macros

>     except `TH` _inside_ its own definition.  Apart from being

>     super-disruptive for change tracking purposes, since it would touch

>     nearly every line in the macro file, I would expect this to be

>     harder to understand and maintain.  Nested macro definitions are

>     fully countenanced by the *roff language but not, I think, a widely

>     mastered technique.

>

>     Better, I think, would be to define all interface macros using "long

>     names", like `an*SH`, and then have `TH` redeclare the public names

>     as aliases, as in `.als SH an*SH`.

>

>     Care and testing would be required, as "andoc.tmac" uses the same

>     technique to permit switching between man(7) and mdoc(7) input.  I

>     am therefore not in a hurry to pick up this task, even though we do

>     already have automated tests to detect failure of such switching.

>

> [2] But not, interestingly, System V.

>     https://github.com/ryanwoodsmall/oldsysv/



troff:<standard input>:38: error: cannot load font 'TinosR' to mark it as special
an.tmac:apsend.1:47: warning: cannot nest .TP or .TQ inside .TP; supply a tag
an.tmac:at.1:92: warning: cannot nest .TP or .TQ inside .TP; supply a tag
troff:awk.1:77: warning: font name 'CW' is deprecated
troff:bcp.1:51: warning: character with input code 12 not defined
an.tmac:calendar.1:100: warning: cannot nest .TP or .TQ inside .TP; supply a tag
an.tmac:calendar.1:102: warning: cannot nest .TP or .TQ inside .TP; supply a tag
an.tmac:calendar.1:104: warning: cannot nest .TP or .TQ inside .TP; supply a tag
an.tmac:cbt.1:137: warning: cannot nest .TP or .TQ inside .TP; supply a tag
an.tmac:cbt.1:139: warning: cannot nest .TP or .TQ inside .TP; supply a tag
troff:cu.1:131: warning: expected numeric expression, got character '`'
troff:dag.1:100: warning: name 'BIwidth' not defined (possibly missing space after 'BI')
an.tmac:dag.1:101: warning: cannot nest .TP or .TQ inside .TP; supply a tag
troff:ftp.1:458: warning: cannot select font '\'
troff:gcc.1:500: error: a space character is not allowed in an escape sequence argument
troff:ideal.1:350: warning: name '..width' not defined (possibly missing space after '..')
troff:ideal.1:351: warning: name '..libfile' not defined (possibly missing space after '..')
troff:ideal.1:352: warning: name '..minx' not defined (possibly missing space after '..')
troff:ld80.1:105: error: cannot clear diversion trap when not diverting output
troff:memo.1:22: error: unterminated transparent embedding escape sequence
troff:mkstr.1:86: warning: name '..SH' not defined (possibly missing space after '..')
troff:mkstr.1:87: warning: name '..All' not defined (possibly missing space after '..')
troff:nm80.1:11: warning: expected numeric expression, got character 'N'
troff:sml.1:44: warning: expected numeric expression, got character 'e'
troff:snocone.1:218: warning: expected numeric expression, got character 'P'
troff:snocone.1:237: warning: expected numeric expression, got character 'I'
troff:splitrules.1:2: error: cannot open '/usr/man/man1/splitinf.1': No such file or directory
troff:tbl.1:271: warning: name 'sp3' not defined (possibly missing space after 'sp')
troff:gplot.1g:3: error: cannot load font 'G' for mounting
troff:pins.1g:0: error: cannot open 'CDL': No such file or directory
troff:pins.1g:24: warning: name 'sp.5' not defined (possibly missing space after 'sp')
troff:vtimes.2v:26: warning: expected numeric expression, got character 't'
troff:erf.3:39: warning: character with input code 4 not defined
troff:manip.3:112: warning: expected numeric expression, got character 'o'
troff:sbuf.prot.3:193: warning: expected numeric expression, got character 'e'
troff:strstream.3:64: warning: expected numeric expression, got character 't'
troff:plot.5:373: warning: expected numeric expression, got character 'd'
troff:ipa.6:0: error: cannot load font 'P1' from file 'IPA1' for mounting
troff:ipa.6:1: error: cannot load font 'P2' from file 'IPA2' for mounting
troff:ipa.6:39: warning: cannot select font 'L'
troff:ipa.6:39: warning: cannot select font 'L'
troff:ipa.6:39: warning: cannot select font 'L'
troff:ipa.6:39: warning: cannot select font 'L'
troff:ipa.6:39: warning: cannot select font 'L'
troff:ipa.6:40: warning: cannot select font 'L'
troff:ipa.6:40: warning: cannot select font 'L'
troff:ipa.6:40: warning: cannot select font 'L'
troff:ipa.6:40: warning: cannot select font 'L'
troff:ipa.6:40: warning: cannot select font 'L'
troff:ipa.6:41: warning: cannot select font 'L'
troff:ipa.6:41: warning: cannot select font 'L'
troff:ipa.6:41: warning: cannot select font 'L'
troff:ipa.6:41: warning: cannot select font 'L'
troff:ipa.6:41: warning: cannot select font 'L'
troff:ipa.6:42: warning: cannot select font 'L'
troff:ipa.6:42: warning: cannot select font 'L'
troff:ipa.6:42: warning: cannot select font 'L'
troff:ipa.6:42: warning: cannot select font 'L'
troff:ipa.6:42: warning: cannot select font 'L'
troff:ipa.6:43: warning: cannot select font 'L'
troff:ipa.6:43: warning: cannot select font 'L'
troff:ipa.6:43: warning: cannot select font 'L'
troff:ipa.6:43: warning: cannot select font 'L'
troff:ipa.6:43: warning: cannot select font 'L'
troff:ipa.6:44: warning: cannot select font 'L'
troff:ipa.6:44: warning: cannot select font 'L'
troff:ipa.6:44: warning: cannot select font 'L'
troff:ipa.6:44: warning: cannot select font 'L'
troff:ipa.6:44: warning: cannot select font 'L'
troff:ipa.6:45: warning: cannot select font 'L'
troff:ipa.6:45: warning: cannot select font 'L'
troff:ipa.6:45: warning: cannot select font 'L'
troff:ipa.6:45: warning: cannot select font 'L'
troff:ipa.6:45: warning: cannot select font 'L'
troff:ipa.6:46: warning: cannot select font 'L'
troff:ipa.6:46: warning: cannot select font 'L'
troff:ipa.6:46: warning: cannot select font 'L'
troff:ipa.6:46: warning: cannot select font 'L'
troff:ipa.6:46: warning: cannot select font 'L'
troff:ipa.6:47: warning: cannot select font 'L'
troff:ipa.6:47: warning: cannot select font 'L'
troff:ipa.6:47: warning: cannot select font 'L'
troff:ipa.6:47: warning: cannot select font 'L'
troff:ipa.6:47: warning: cannot select font 'L'
troff:ipa.6:48: warning: cannot select font 'L'
troff:ipa.6:48: warning: cannot select font 'L'
troff:ipa.6:48: warning: cannot select font 'L'
troff:ipa.6:48: warning: cannot select font 'L'
troff:ipa.6:48: warning: cannot select font 'L'
troff:ipa.6:49: warning: cannot select font 'L'
troff:ipa.6:49: warning: cannot select font 'L'
troff:ipa.6:49: warning: cannot select font 'L'
troff:ipa.6:49: warning: cannot select font 'L'
troff:ipa.6:49: warning: cannot select font 'L'
troff:ipa.6:50: warning: cannot select font 'L'
troff:ipa.6:50: warning: cannot select font 'L'
troff:ipa.6:50: warning: cannot select font 'L'
troff:ipa.6:50: warning: cannot select font 'L'
troff:ipa.6:50: warning: cannot select font 'L'
troff:ipa.6:51: warning: cannot select font 'L'
troff:ipa.6:51: warning: cannot select font 'L'
troff:ipa.6:51: warning: cannot select font 'L'
troff:ipa.6:51: warning: cannot select font 'L'
troff:ipa.6:51: warning: cannot select font 'L'
troff:ipa.6:52: warning: cannot select font 'L'
troff:ipa.6:52: warning: cannot select font 'L'
troff:ipa.6:52: warning: cannot select font 'L'
troff:ipa.6:52: warning: cannot select font 'L'
troff:ipa.6:52: warning: cannot select font 'L'
troff:ipa.6:53: warning: cannot select font 'L'
troff:ipa.6:53: warning: cannot select font 'L'
troff:ipa.6:53: warning: cannot select font 'L'
troff:ipa.6:53: warning: cannot select font 'L'
troff:ipa.6:53: warning: cannot select font 'L'
troff:ipa.6:54: warning: cannot select font 'L'
troff:ipa.6:39: warning: cannot select font 'L'
troff:ipa.6:39: warning: cannot select font 'L'
troff:ipa.6:39: warning: cannot select font 'L'
troff:ipa.6:39: warning: cannot select font 'L'
troff:ipa.6:39: warning: cannot select font 'L'
troff:ipa.6:40: warning: cannot select font 'L'
troff:ipa.6:40: warning: cannot select font 'L'
troff:ipa.6:40: warning: cannot select font 'L'
troff:ipa.6:40: warning: cannot select font 'L'
troff:ipa.6:40: warning: cannot select font 'L'
troff:ipa.6:41: warning: cannot select font 'L'
troff:ipa.6:41: warning: cannot select font 'L'
troff:ipa.6:41: warning: cannot select font 'L'
troff:ipa.6:41: warning: cannot select font 'L'
troff:ipa.6:41: warning: cannot select font 'L'
troff:ipa.6:42: warning: cannot select font 'L'
troff:ipa.6:42: warning: cannot select font 'L'
troff:ipa.6:42: warning: cannot select font 'L'
troff:ipa.6:42: warning: cannot select font 'L'
troff:ipa.6:42: warning: cannot select font 'L'
troff:ipa.6:43: warning: cannot select font 'L'
troff:ipa.6:43: warning: cannot select font 'L'
troff:ipa.6:43: warning: cannot select font 'L'
troff:ipa.6:43: warning: cannot select font 'L'
troff:ipa.6:43: warning: cannot select font 'L'
troff:ipa.6:44: warning: cannot select font 'L'
troff:ipa.6:44: warning: cannot select font 'L'
troff:ipa.6:44: warning: cannot select font 'L'
troff:ipa.6:44: warning: cannot select font 'L'
troff:ipa.6:44: warning: cannot select font 'L'
troff:ipa.6:45: warning: cannot select font 'L'
troff:ipa.6:45: warning: cannot select font 'L'
troff:ipa.6:45: warning: cannot select font 'L'
troff:ipa.6:45: warning: cannot select font 'L'
troff:ipa.6:45: warning: cannot select font 'L'
troff:ipa.6:46: warning: cannot select font 'L'
troff:ipa.6:46: warning: cannot select font 'L'
troff:ipa.6:46: warning: cannot select font 'L'
troff:ipa.6:46: warning: cannot select font 'L'
troff:ipa.6:46: warning: cannot select font 'L'
troff:ipa.6:47: warning: cannot select font 'L'
troff:ipa.6:47: warning: cannot select font 'L'
troff:ipa.6:47: warning: cannot select font 'L'
troff:ipa.6:47: warning: cannot select font 'L'
troff:ipa.6:47: warning: cannot select font 'L'
troff:ipa.6:48: warning: cannot select font 'L'
troff:ipa.6:48: warning: cannot select font 'L'
troff:ipa.6:48: warning: cannot select font 'L'
troff:ipa.6:48: warning: cannot select font 'L'
troff:ipa.6:48: warning: cannot select font 'L'
troff:ipa.6:49: warning: cannot select font 'L'
troff:ipa.6:49: warning: cannot select font 'L'
troff:ipa.6:49: warning: cannot select font 'L'
troff:ipa.6:49: warning: cannot select font 'L'
troff:ipa.6:49: warning: cannot select font 'L'
troff:ipa.6:50: warning: cannot select font 'L'
troff:ipa.6:50: warning: cannot select font 'L'
troff:ipa.6:50: warning: cannot select font 'L'
troff:ipa.6:50: warning: cannot select font 'L'
troff:ipa.6:50: warning: cannot select font 'L'
troff:ipa.6:51: warning: cannot select font 'L'
troff:ipa.6:51: warning: cannot select font 'L'
troff:ipa.6:51: warning: cannot select font 'L'
troff:ipa.6:51: warning: cannot select font 'L'
troff:ipa.6:51: warning: cannot select font 'L'
troff:ipa.6:52: warning: cannot select font 'L'
troff:ipa.6:52: warning: cannot select font 'L'
troff:ipa.6:52: warning: cannot select font 'L'
troff:ipa.6:52: warning: cannot select font 'L'
troff:ipa.6:52: warning: cannot select font 'L'
troff:ipa.6:53: warning: cannot select font 'L'
troff:ipa.6:53: warning: cannot select font 'L'
troff:ipa.6:53: warning: cannot select font 'L'
troff:ipa.6:53: warning: cannot select font 'L'
troff:ipa.6:53: warning: cannot select font 'L'
troff:ipa.6:54: warning: cannot select font 'L'
troff:mbits.6:0: error: cannot open '/usr/lib/tmac/tmac.bits': No such file or directory
troff:mbits.6:68: error: system command execution request is not allowed in safer mode
troff:mbits.6:72: error: system command execution request is not allowed in safer mode
troff:mbits.6:76: error: system command execution request is not allowed in safer mode
troff:mbits.6:80: error: system command execution request is not allowed in safer mode
troff:mbits.6:84: error: system command execution request is not allowed in safer mode
troff:mbits.6:88: error: system command execution request is not allowed in safer mode
troff:mbits.6:95: error: system command execution request is not allowed in safer mode
troff:av.7:102: warning: expected numeric expression, got character 'u'
troff:scat.7:21: warning: name 'RB(' not defined (possibly missing space after 'RB')
troff:fstat.8:37: error: cannot clear diversion trap when not diverting output
troff:mkfs.8:118: warning: name 'sp5' not defined (possibly missing space after 'sp')
troff:svcmgr.8:316: warning: name '..SH' not defined (possibly missing space after '..')
troff:svcmgr.8:317: warning: name '..to' not defined (possibly missing space after '..')
troff:upas.8:208: warning: name 'EX#' not defined (possibly missing space after 'EX')
an.tmac:blitblt.9:27: warning: cannot nest .TP or .TQ inside .TP; supply a tag
troff:faced.9:26: warning: font name 'H' is deprecated
troff:jioctl.9:61: warning: expected numeric expression, got character '"'
troff:mds.10:62: warning: expected numeric expression, got character ','
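
Since many of the diagnostics above repeat, a small tally makes the log easier to work through. This is a minimal Python sketch, assuming every diagnostic has the "program:file:line: level: message" shape seen in the log; the function name is hypothetical.

```python
import re
from collections import Counter

# program:file:line: warning|error: message
LOG_LINE = re.compile(
    r"^(?P<prog>[^:]+):(?P<file>[^:]+):(?P<line>\d+): "
    r"(?P<level>warning|error): (?P<msg>.*)$"
)


def summarize(log_text: str) -> Counter:
    """Count diagnostics per (file, level, message), ignoring line numbers."""
    counts: Counter = Counter()
    for raw in log_text.splitlines():
        match = LOG_LINE.match(raw)
        if match:
            counts[(match["file"], match["level"], match["msg"])] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python summarize_log.py < groff.log
    for (name, level, msg), n in summarize(sys.stdin.read()).most_common():
        print(f"{n:4d}  {name}: {level}: {msg}")
```

Grouping by file and message rather than by line number collapses long runs, such as the repeated font warnings for a single page, into one entry each.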
