Hi Alex,

At 2022-07-19T14:17:15+0200, Alejandro Colomar wrote:
> Hi, наб and Branden!

I'm not exactly sure what you wanted me to comment on in this patch
submission. Keep in mind that I am a bear of little brain--please be
clear what it is you're asking of me. ;-)

I will assume that it is my *roff/man(7) expertise (such as it is) and
will respond on that assumption. I will also comment on English usage
because I can't help myself.

> > diff --git a/man3/tm.3type b/man3/tm.3type

Oh, bother. Bash autocompletion for "man" on my Debian bullseye is too
dumb to recognize this new man page suffix. I trust someone reading
this is aware of the problem and is fixing it for the next Debian
release. (Has someone filed this as a bug with the Debian BTS?) Other
distributions may have similar concerns.

> > index 1931d890d..8b6f8d9bf 100644
> > --- a/man3/tm.3type
> > +++ b/man3/tm.3type
> > @@ -25,8 +25,26 @@ Standard C library
> > .BR " int tm_yday;" \
> > " /* Day of the year [" 0 ", " 365 "] (Jan/01 = " 0 ") */"
> > .BR " int tm_isdst;" " /* Daylight savings flag */"
> > +
> > +.BR " long tm_gmtoff;" " /* Seconds East of UTC */"
> > +.BR " char*tm_zone;" " /* Timezone abbreviation */"
>
> Please add cosmetic whitespace (at least 1 for every member, possibly
> 2, depending on your taste) :)

Hmmm.

I'm attaching a screenshot of Okular's rendering of the current state of
tm(3type) in the Linux man-pages Git repository to PostScript.

Recall the advice in groff's Texinfo manual. This is from groff 1.22.4.

    5.1.6 Input Conventions
    -----------------------
    [...]
    * Do not try to do any formatting in a WYSIWYG manner (i.e., don't
      try using spaces to get proper indentation).

Synopses in man pages, whether for section [168] commands or section
[23] C function calls or data types, are not typically set in a
monospaced typeface, nor do I think they should be. A proportional
typeface generally looks better.
The price of that improved appearance is that the use of sequences of
spaces to get columnar alignment breaks as soon as there is variation in
the content.

The traditional solution to this problem in the *roff language is to set
tab stops. However, man-pages(7) calls out tab stop manipulation as
unportable man(7) usage.

    * Example programs should be laid out according to Kernighan and
      Ritchie style, with 4‐space indents. (Avoid the use of TAB
      characters in source code!)

Now, section 2 and 3 synopses are not _example program_ source code, so
a defense of tab usage could be made here, but a man page author simply
trying to get their stuff documented could be forgiven for feeling that
drawing such a distinction is hair-splitting.

Using spaces is, however, in my opinion, worse simply due to the effect
on rendered output for everything that isn't a terminal.

There are a few ways to address this issue.

A. Don't worry about it and let HTML/PostScript/PDF output look ugly.

B. Stick synopses, at least for section 2 and 3 man pages, in EX/EE
   blocks, which switch the typeface to Courier on typesetting output
   devices (which includes HTML if the groff project fixes grohtml to
   change font families--it's _supposed_ to, but something broke a long
   time ago). My recollection is that Michael Kerrisk opposed this
   practice. I too don't think it's a great idea; the average glyph
   width is lower in proportional fonts, so using it, you can fit more
   content on an output line.

C. Use tabs anyway. For results that will actually get what you want,
   you will need to set the tab stops to ensure they're wide enough to
   achieve the desired alignment. The use of custom tab stops requires
   invoking the `ta` request, and this is warned against in the
   "Portability" section of groff_man(7) (to be part of
   groff_man_style(7) in groff 1.23). But by invoking the `nf` and `fi`
   requests for other reasons, this project's pages have already crossed
   that bridge.

C1.
Actually selecting values for the tab stops can be tedious. You can
hard-code measurements, but it will be hard to maintain consistency
among contributors (will you use ens, ems, inches, or centimeters as the
scaling unit?) and, much worse, the size of the rendered typeface can
vary. groff_man(7) explicitly countenances selection of a 10-, 11-, or
12-point typeface. At present, no means of changing the default font
family for body text is exposed, but it might be in the future. So I
expect the temptation will be to set tab stops for 10-point Times (but
see below), which will lead to ugly results for other family/size
selections.

C2. Clever roff writers (sometimes too clever) reach for the \w escape
sequence to overcome this problem. So instead of hard-coding tab stop
lengths, they have the formatter compute them based on sample inputs.
For the page under discussion, this practice would lead to requests that
look like the following.

    .ta \w'char*' \w'tm_gmtoff'

What's happening here is that the "longest" item within each tab stop is
getting its length computed, and those computed lengths used as the tab
stop values. In practice this won't quite do because it will leave no
space between the items in the event the same row has two of the longest
column entries adjacent, so you more often see something like this.

    .ta \w'char*'+1n \w'tm_gmtoff'+1n

This ensures that the tab stops have one extra "en" of space between
them. It doesn't suck, but at this point your man page renderer needs
to be sophisticated enough to include an arithmetic expression
evaluator. This provokes grumbles from folks like Ingo who maintain
non-roff man page formatters.

It is true that we could add a macro to man(7) that conceals a bit of
this complexity. Like this.

    .TA char gmtoff

This certainly looks much cleaner, and in fact it closely resembles
Texinfo's @multitable command. But it is just a mask over the `ta`
request of frightening appearance above, not a silver bullet.

C3.
The above has the problem that it relies upon the writer to know which
pieces of text between the tab stops are the longest. This sounds like
an obvious thing that no one would ever screw up. I think that
assumption would be swiftly overturned. There are two big problems.

The first is maintenance. Considering potential applications in Linux
man-pages, you will often have situations where someone adds a new
function or struct member to a synopsis. A contributor may already be
at the limit of their man(7) knowledge. They may not look far enough up
the page to see the `ta` request, may not understand it, and may not
think to consider that they've just added a new longest item, and thus
need to update that `ta` request. Because that request may be outside
the scope of the diff context, it will be easy for reviewers to
overlook, too.

The other issue is more subtle. I predict that contributors are likely
to reckon widths in terms of character cells, not the horizontal
measurement of rendered text. Because a proportional font is used for
rendering, the results can be surprising.

    $ groff
    .nr m \w'mmm'
    .nr i \w'iiii'
    .tm m=\nm, i=\ni
    m=23340, i=11120

In 10-point Times, "mmm" is over twice as wide as "iiii". I dare say
few man page contributors are going to think of this. Not having Times
roman's font metrics and a full adder operating in their heads when
they're thinking about documenting an API, they will frequently fail to
correctly select the "longest" content within a particular tab stop for
an argument to \w in a `ta` request.

Sorting this kind of thing out is a pain. Why don't we have something
that recognizes when we're using a series of lines with tabs, then reads
them all and computes the tab stops necessary to separate them nicely?

D. Congratulations, you've discovered tbl(1).[1]

I guess my advice is to choose your poison. I'll advise as best I can.

> I tend to prefer the em dash to be next to (no whitespace) the
> enclosed clause.
> That makes it easier to mentally associate (as in a
> set of parentheses) to the clause. I'm not sure if it's a thing of
> mine, or if it's standard practise?

    "Spacing around an em dash varies. Most newspapers insert a space
    before and after the dash, and many popular magazines do the same,
    but most books and journals omit spacing, closing whatever comes
    before and after the em dash right up next to it. This website
    prefers the latter, its style requiring the closely held em dash in
    running text."

https://www.merriam-webster.com/words-at-play/em-dash-en-dash-how-to-use

In the groff man pages, I too "close up" any space around em dashes, but
I freely admit that this (1) doesn't look all that great in terminal
rendering [it too closely resembles other dashes--a "fullwidth" dash
taking two character cells would be preferable on purely esthetic
grounds, and probably a nightmare to get terminal emulators to cope
with] and (2) it frustrates my input style; since I don't want to use
the `\c` escape sequence, I end up putting the words immediately outside
the em-dashed aside on the "wrong" lines semantically. Maybe I should
just get over my allergy to `\c` now that I understand how it
works.[citation needed]

> What is "&a."? Is that documented somewhere? I didn't know that
> abbreviature.

Having seen наб's reply, it seems of a piece with "&c.", which was in
English formerly (ca. 150 years ago) a common abbreviation for the Latin
"et cetera". Nowadays "etc." has fully supplanted "&c." while many
native English speakers are shaky on what, exactly, it abbreviates, even
spelling it "ect." because that better aligns with English language
phonotactics.

I admit never having seen "&a." before in English writing. Like
Germans' use of "resp.", it may be a thing non-native speakers assume
"ports" into English, but doesn't.

Regards,
Branden

[1] https://git.savannah.gnu.org/cgit/groff.git/tree/src/preproc/tbl/tbl.1.man
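
P.S. For concreteness, here is a minimal sketch of what option D might
look like for the three members quoted from the patch above. It is an
illustration under assumptions, not a proposal for the page: the `@`
column separator is an arbitrary choice made here so that no literal tab
characters are needed, and the three left-aligned columns are just one
plausible layout. The point is only that tbl(1) reads every row and
works out the column widths itself, so there is no `ta` request and no
\w arithmetic to keep in sync.

```roff
.\" Sketch only: separator and column layout are illustrative choices.
.TS
tab(@);
l l l.
int@tm_isdst;@/* Daylight savings flag */
long@tm_gmtoff;@/* Seconds East of UTC */
char@*tm_zone;@/* Timezone abbreviation */
.TE
```

Pages that use tbl conventionally start with a `'\" t` comment line so
that man(1) knows to run the preprocessor. Whether that trade-off suits
Linux man-pages is a separate question from the sketch.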
Attachment:
tm.3type.ps.png
Description: PNG image