Hi Helge,

At 2022-03-13T13:34:22+0100, Helge Kreutzmann wrote:
> Without further ado, the following was found:
>
> Issue: In the right table, please add \& markup for end of sentence
> characters (? ! .) to get proper formatting in other locales.

Thanks! Specifically, what happens is that if the additional
inter-sentence space amount (set with the `ss` request) is not the same
as the inter-word space amount, the columnation of this "table" (not a
tbl(1) table) gets thrown off.

This is an area that has seen significant clarification in the groff
Texinfo manual and other documentation since the 1.22.4 release, so I
ask the reader's indulgence while I quote it.

 -- Request: .ss word-space-size [additional-sentence-space-size]
 -- Register: \n[.ss]
 -- Register: \n[.sss]
     Set the sizes of spaces between words and sentences.(1) (*note
     Manipulating Filling and Adjustment-Footnote-1::) Their units are
     twelfths of the space width of the current font. Initially both
     the WORD-SPACE-SIZE and ADDITIONAL-SENTENCE-SPACE-SIZE are 12.
     Negative values are not permitted. The request is ignored if
     there are no arguments.

     The first argument, the inter-word space size, is a minimum; if an
     output line undergoes adjustment, such spaces may increase in
     width.

     The optional second argument sets the amount of additional space
     separating sentences on the same output line. If omitted, this
     amount is set to WORD-SPACE-SIZE.

     The read-only registers '.ss' and '.sss' hold the values of
     minimal inter-word space and additional inter-sentence space,
     respectively. These parameters are associated with the
     environment (*note Environments::), and rounded down to the
     nearest multiple of 12 on terminal output devices.

     Additional inter-sentence spacing is used only if the output line
     is not full when the end of a sentence occurs in the input.
     If a sentence ends at the end of an input line, then both an
     inter-word space and an inter-sentence space are added to the
     output; if two spaces follow the end of a sentence in the middle
     of an input line, then the second space becomes an inter-sentence
     space in the output. Additional inter-sentence space is not
     adjusted, but the inter-word space that always precedes it may
     be. Further input spaces after the second, if present, are
     adjusted as normal.
[...]
   (1) *Note Filling:: and *note Sentences:: for the definitions of
word and sentence boundaries, respectively.

> "   2 3 4 5 6 7       30 40 50 60 70 80 90 100 110 120\n"
> " -------------      ---------------------------------\n"
> "0:   0 @ P \\` p     0:    (  2  E<lt>  F  P  Z  d   n   x\n"
> "1: ! 1 A Q a q     1:    )  3  =  G  Q  [  e   o   y\n"
> "2: \" 2 B R b r     2:    *  4  E<gt>  H  R  \\e  f   p   z\n"
> "3: # 3 C S c s     3: !  +  5  ?  I  S  ]  g   q   {\n"
> "4: $ 4 D T d t     4: \"  ,  6  @  J  T  \\(ha  h   r   |\n"
> "5: % 5 E U e u     5: #  -  7  A  K  U  _  i   s   }\n"
> "6: & 6 F V f v     6: $  .  8  B  L  V  \\`  j   t   \\(ti\n"
> "7: \\(aq 7 G W g w     7: %  /  9  C  M  W  a  k   u  DEL\n"
> "8: ( 8 H X h x     8: &  0  :  D  N  X  b  l   v\n"
> "9: ) 9 I Y i y     9: \\(aq  1  ;  E  O  Y  c  m   w\n"
> "A: * : J Z j z\n"
> "B: + ; K [ k {\n"
> "C: , E<lt> L \\e l |\n"
> "D: - = M ] m }\n"
> "E: . E<gt> N \\(ha n \\(ti\n"
> "F: / ? O _ o DEL\n"

The piece of ascii(7) quoted above renders as expected if none of the
groff localization macro files are loaded, and if the user/administrator
has not changed the additional inter-sentence space amount in "troffrc"
or "man.local"--but doing so is supported.

A common preference, and one shared by the Czech, German, French,
Italian[1], and Swedish groff localization files, is to set additional
inter-sentence space to zero with `.ss 12 0`. Here is the result.

   Tables                                                              │
       For convenience, below are more compact tables in hex and decimal.

          2 3 4 5 6 7       30 40 50 60 70 80 90 100 110 120
        -------------      ---------------------------------
       0:   0 @ P ` p     0:    (  2  <  F  P  Z  d   n   x
       1: ! 1 A Q a q     1:    )  3  =  G  Q  [  e   o   y
       2: " 2 B R b r     2:    *  4  >  H  R  \  f   p   z
       3: # 3 C S c s     3: ! +  5  ? I  S  ]  g   q   {
       4: $ 4 D T d t     4: "  ,  6  @  J  T  ^  h   r   |
       5: % 5 E U e u     5: #  -  7  A  K  U  _  i   s   }
       6: & 6 F V f v     6: $  . 8  B  L  V  `  j   t   ~
       7: ' 7 G W g w     7: %  /  9  C  M  W  a  k   u  DEL
       8: ( 8 H X h x     8: &  0  :  D  N  X  b  l   v
       9: ) 9 I Y i y     9: '  1  ;  E  O  Y  c  m   w
       A: * : J Z j z
       B: + ; K [ k {
       C: , < L \ l |
       D: - = M ] m }
       E: . > N ^ n ~
       F: / ? O _ o DEL

(Yes, there is a stray pipe symbol on the same line as the subsection
heading.[2])

I've confirmed that Helge's solution works. In principle, it is fragile
to locales that have other sentence-ending characters, but I know of no
such locales--none are extant in groff, pending, or requested.
Therefore I'm +1 on this.

Perhaps better changes would be to (1) have the Linux man-pages start
using groff's EX/EE macros for this and (2) change groff's EX/EE macros
to start doing what everyone already thinks they do, and shut off
additional inter-sentence space (temporarily). These would be
supplemental to the existing proposed fix. Having the additional `\&`
escape sequences will cause no harm, and might be salutary examples.

I noticed just last night that the iso-8859*(7) man pages have a much
worse problem; they use raw 8-bit characters in the input, which leads
to UTF-8 mojibake and/or confusing and incorrect character names for the
glyphs that appear when you render one ISO 8859 encoding's page on
another. (man-db man(1) hides this problem, possibly by using its
manconv(1) utility--but man pages should be written so that troff -man
works.) The correct thing to do is use groff special character escape
sequences; these _name_ the desired glyph and are more robust to
character encoding conversions (albeit requiring use of preconv(1)).

Anyone have thoughts on any of the above?

Regards,
Branden

[1] forthcoming in groff 1.23

[2] This appears to be because the preceding tbl(1) table is too wide
    for 78 columns. I'll have a look and see if I can tweak it.
    Or this may be a tbl(1) bug; several have been fixed over the past
    couple of years[3].

[3] https://savannah.gnu.org/bugs/index.php?go_report=Apply&group=groff&func=&set=custom&msort=0&report_id=101&advsrch=0&status_id=3&resolution_id=1&submitted_by=0&assigned_to=0&category_id=109&bug_group_id=0&severity=0&summary=&details=&sumORdet=0&history_search=0&history_field=0&history_event=modified&history_date_dayfd=14&history_date_monthfd=3&history_date_yearfd=2022&chunksz=50&spamscore=5&boxoptionwanted=1#options
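
For illustration, here is a minimal roff sketch of the effect discussed
in this message. It is not taken from the ascii(7) source: the two
sample rows are abbreviated, the file name in the comment is
hypothetical, and it assumes a terminal (nroff) output device with the
rows set in no-fill mode (`.nf`).

```roff
.\" sketch.roff (hypothetical file name); try: nroff sketch.roff
.\" Set additional inter-sentence space to zero, as several groff
.\" localization files do.
.ss 12 0
.nf
.\" Unprotected: "!" and "?" followed by two spaces are taken as
.\" sentence endings, so the second space is inter-sentence space.
3: !  +  5  ?  I  S
.\" Protected: \& after the character suppresses end-of-sentence
.\" recognition, so both spaces remain ordinary word spaces.
3: !\&  +  5  ?\&  I  S
.fi
```

If the behavior matches the manual text quoted in this message, the
first row should come out narrower than the second, while with the
initial setting (`.ss 12 12`) both rows should line up.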