Hi Paul,

At 2022-11-25T18:31:02-0800, Paul Eggert wrote:
> On 2022-11-23 10:43, Paul Eggert wrote:
> > I installed that
>
> Further testing showed that the installed patch doesn't work with
> traditional troff, which doesn't support groff escape sequences like
> \(aq.

I think this patch goes too far in the retrograde direction.

\(xx, where xx is any two characters, is not a groff extension. It
comes from Ossanna troff all the way back in the mid-1970s. It is a
special character escape sequence; a groff way of spelling it is \[xxx]
where xxx can be of any nonzero length (but cannot contain a closing
square bracket).

The repertoire of supported special character identifiers varies by
implementation and, after Kernighan's rewrite of troff circa 1980 for
device-independence, by output device.

Nevertheless, for portability/backward compatibility, a set of them is
very widely supported. These include three that your patch takes out,
\(ha, \(ga, and \(ti. Replacing these with ASCII characters will _not_
produce correct typography on typesetting output devices.

I would attach scans of Tables I and II from "NROFF/TROFF User's
Manual", the version dated 1976, published with Volume 2 of the Unix
Programmer's Manual (1979), and reprinted by Holt, Rinehart, and Winston
in 1983, but the linux-man list rejects all attachments bigger than a
breadbox, so I will ask for your trust (or you can ask me for the scans
privately). Those tables illustrate the glyph repertoire of Ossanna
troff and the special character identifiers that were implemented.

groff_char(7) from groff 1.22.4 and earlier marks the special character
identifiers you can expect to be portable (with "***" in its listings),
and for 1.23 I have added a "History" section to the page which
addresses most of the thousand questions I've asked over the past few
years while trying to learn this stuff. I'll put that in a
footnote.[1]

> To fix this I installed the equivalent of the attached further patch to
> TZDB.
I therefore propose the following snippet instead, also taking into
account Solaris 10 troff's poor handling of unsupported font selections
in nroff.

.q + .
To allow for future extensions,
an unquoted name should not contain characters from the set
.ie \n(.g .q \f(CR!$%&\(aq()*,/:;<=>?@[\e]\(ha\(ga{|}\(ti\fP .
.el .ie t .q \f(CW!$%&'()*,/:;<=>?@[\e]\(ha\(ga{|}\(ti\fP .
. el .q !$%&'()*,/:;<=>?@[\e]\(ha\(ga{|}\(ti .
.TP
.B FROM
Gives the first year in which the rule applies.

What do you think?

Regards,
Branden

[1] (Much UTF-8 follows.)

History

A consideration of the typefaces originally available to AT&T nroff and
troff illuminates many conventions that one might regard as
idiosyncratic fifty years afterward. (See section “History” of roff(7)
for more context.) The face used by the Teletype Model 37 terminals of
the Murray Hill Unix Room was based on ASCII, but assigned multiple
meanings to several code points, as suggested by that standard. Decimal
34 (") served as a dieresis accent and neutral double quotation mark;
decimal 39 (') as an acute accent, apostrophe, and closing (right)
single quotation mark; decimal 45 (-) as a hyphen and a minus sign;
decimal 94 (^) as a circumflex accent and caret; decimal 96 (`) as a
grave accent and opening (left) single quotation mark; and decimal 126
(~) as a tilde accent and (with a half‐line motion) swung dash. The
Model 37 bore an optional extended character set offering upright Greek
letters and several mathematical symbols; these were documented as early
as the kbd(VII) man page of the (First Edition) Unix Programmer’s
Manual.

At the time Graphic Systems delivered the C/A/T phototypesetter to AT&T,
the ASCII character set was not considered a standard basis for a glyph
repertoire by traditional typographers. In the stock Times roman,
italic, and bold styles available, several ASCII characters were not
present at all, nor was most of the Teletype’s extended character set.
AT&T commissioned a “special” font to ensure no loss of repertoire.

A representation of the coverage of the C/A/T’s text fonts follows. The
glyph resembling an underscore is a baseline rule, and that resembling a
vertical line is a box rule. In italics, the box rule was not slanted.
We also observe that the hyphen and minus sign were already “de‐unified”
by the fonts provided; a decision whither to map an input “-” therefore
had to be taken.

┌────────────────────────────────────────────────────┐
│A B C D E F G H I J K L M N O P Q R S T U V W X Y Z │
│a b c d e f g h i j k l m n o p q r s t u v w x y z │
│0 1 2 3 4 5 6 7 8 9 fi fl ffi ffl                   │
│! $ % & ( ) ‘ ’ * + - . , / : ; = ? [ ]             │
│ │• □ — ‐ _ ¼ ½ ¾ ° † ′ ¢ ® ©                       │
└────────────────────────────────────────────────────┘

The special font supplied the missing ASCII and Teletype extended
glyphs, among several others. The plus, minus, and equals signs
appeared in the special font despite availability in text fonts “to
insulate the appearance of equations from the choice of standard [read:
text] fonts”—a priority since troff was turned to the task of
mathematical typesetting as soon as it was developed. We note that AT&T
took the opportunity to de‐unify the apostrophe/right single quotation
mark from the acute accent (a choice ISO later duplicated in its 8859
series of standards). A slash intended to be mirror‐symmetric with the
backslash was also included, as was the Bell System logo; we do not
attempt to depict the latter.

┌──────────────────────────────────────────────────────────┐
│α β γ δ ε ζ η θ ι κ λ μ ν ξ ο π ρ σ ς τ υ ϕ χ ψ ω         │
│Γ Δ Θ Λ Ξ Π Σ Υ Φ Ψ Ω                                     │
│" ´ \ ^ _ ` ~ / < > { } # @ + − = ∗                       │
│≥ ≤ ≡ ≈ ∼ ≠ ↑ ↓ ← → × ÷ ± ∞ ∂ ∇ ¬ ∫ ∝ √ ‾ ∪ ∩ ⊂ ⊃ ⊆ ⊇ ∅ ∈ │
│§ ‡ ☜ ☞ | ○ ⎧ ⎩ ⎫ ⎭ ⎨ ⎬ ⎪ ⌊ ⌋ ⌈ ⌉                         │
└──────────────────────────────────────────────────────────┘

One ASCII character as rendered by the Model 37 was apparently
abandoned. That device printed decimal 124 (|) as a broken vertical
line, like Unicode U+00A6 (¦).
No equivalent was available on the C/A/T; the box rule \[br], brace
vertical extension \[bv], and “or” operator \[or] were used as
contextually appropriate.

Devices supported by AT&T device‐independent troff exhibited some
differences in glyph detail. For example, on the Autologic APS‐5
phototypesetter, the square \(sq became filled in the Times bold face.

[The lowercase Greek letters in the last boxed table above render in
italics where feasible; that is not feasible when pasting into a plain
text email.]