Hi Branden,

On 2023-08-14 02:18, G. Branden Robinson wrote:
> Use the man(7) macro `MR`, new to groff 1.23.0, instead of font style
> alternation macros to mark up man page cross references.
>
> Depending on your configuration of groff man(7), this change may also
> alter the typeface that is used to mark up man page topic names (that
> is, the "ls" in "ls(1)"). groff by default sets these in italics (which
> often appear underlined on terminals), in concord with the original AT&T
> Unix troff man(7) implementation in 1979. A motivational excursus is
> available.[1] To change this typeface selection, see the end of this
> commit message.
>
> Background (from the groff 1.23.0 release announcement and "NEWS" file):
>
> o The an (man) macro package supports a new macro, `MR`, intended for
>   use by man page cross references in preference to the font style
>   alternation macros historically used. Where before you would write
>     .BR ls (1).
>   or
>     .IR ls (1).
>   you should now write
>     .MR ls 1 .
>   (the third argument, typically used for trailing punctuation, is
>   optional). Because the macro semantically identifies a man page, it
>   can create a clickable hyperlink ("man:ls(1)" for the above example)
>   on supporting devices. Furthermore, a new string, `MF`, defines the
>   font to be used for setting the man page topic (the first argument to
>   `MR` and `TH`), permitting configuration by distributions, sites, and
>   users.
>
>   Inclusion of the `MR` macro was prompted by its introduction to
>   Plan 9 from User Space's troff in August 2020.
>   Its purpose is to ameliorate several long-standing problems with man
>   page cross references: (1) the package's lack of inherent hyperlink
>   support for them; (2) false-positive identification of strings
>   resembling man page cross references (as can happen with "exit(1)",
>   "while(1)", "sleep(5)", "time(0)" and others) by terminal emulators
>   and other programs; (3) the unwanted intrusion of hyphens into man
>   page topics, which frustrates copy-and-paste operations (this problem
>   has always been avoidable through use of the \% escape sequence, but
>   cross references are frequent in man pages and some page authors are
>   inexpert *roff users); and (4) deep divisions in man page maintenance
>   communities over which typeface should be used to set the man page
>   topic (italics, roman, or bold).
> [...]
> o The an (man) macro package can now produce clickable hyperlinks within
>   terminal emulators, using the OSC 8 support added to grotty(1) (see
>   below). The groff man(7) extension macros `UR` and `MT`, present
>   since 2007, expose this feature. At present the feature is disabled
>   by default in `man.local` pending more widespread recognition of OSC 8
>   sequences in pager programs. The package now recognizes a `U`
>   register to enable hyperlinks in any output driver supporting them.
>
>   Use a command like
>     printf '\033]8;;man:grotty(1)\033\\grotty(1)\033]8;;\033\\\n' | more
>   to check your terminal and pager for OSC 8 support. If you see
>   "grotty(1)" and no additional garbage characters, then you may wish to
>   edit "man.local" to remove the lines that disable this feature.
>
> When the text of all Linux man-pages documents (excluding those
> containing only `so` requests) is dumped, with adjustment mode 'l'
> ("-dAD=l") and automatic hyphenation disabled ("-rHY=0") before and
> after this change, there is no change to rendered output.
>
> When automatic hyphenation is enabled, this change suppresses
> hyphenation of 3,100+ man page names in cross references when using
> the default terminal width of 80 (meaning that the text is formatted
> for a line length of 78 for historical reasons).
>
> I prepared this change with the following GNU sed script.
>
> \# Handle simplest cases: ".BR foo (1)" and ".IR foo (1)".
> s/^.[BI]R \(\\%\)*\([.@_[:alnum:]\\-]\+\) (\([1-9a-z]\+\))$/.MR \2 \3/
> \# Handle case: trailing punctuation, as in ".IR foo (1),".
> s/^.[BI]R \(\\%\)*\([.@_[:alnum:]\\-]\+\) (\([1-9a-z]\+\))\([^[:space:]]\+\)$/.MR \2 \3 \4/
> \# Handle case: leading punctuation, as in ".RI ( foo (1)".
> s/^.R[BI] \(\\%\)*\([^[:space:]]\+\) \([.@_[:alnum:]\\-]\+\) (\([1-9a-z]\+\))\([^[:space:]]\+\)$/\\%\2\\c\n.MR \3 \4 \5/
> \# Handle case: 3rd+ arguments or trailing comments. This case is rare
> \# and will require manual fixup if there are 4+ arguments to MR. Use
> \# groff -man -rCHECKSTYLE=1 to have them automatically reported.
> s/^.[BI]R \(\\%\)*\([.@_[:alnum:]\\-]\+\) (\([1-9a-z]\+\))\( .*\)/.MR \2 \3\4/
>
> Confirmed no errors arising in `MR` argument count as follows.
>
> $ groff --version | head -n 1
> GNU groff version 1.23.0
> $ groff -z -t -rCHECKSTYLE=1 -m andoc -T utf8 -P -cbou \
>     $(grep -L '^\.so ' man*/* | sort) 2>&1 | grep MR | grep . \
>     || echo "IT'S CLEAN"
> IT'S CLEAN
>
> To get the man page topic names to render in bold again as the Linux
> man-pages have historically done, set the *roff "MF" string to "B".
>
> 1. man-db man(1) supports an environment variable for passing options

These seem to be alternatives. For alternatives, I prefer letters
instead of numbers. See man-pages(7):

$ MANWIDTH=72 man man-pages | sed -n '/^ Lists/,/^ [^ ]/p' | head -n-1;
   Lists
       There are different kinds of lists:

       [...]

       Ordered lists
              Elements are preceded by a number in parentheses (1),
              (2). These represent a set of steps that have an order.
              When there are substeps, they will be numbered like
              (4.2).

       [...]

       Alternatives list
              Elements are preceded by a letter in parentheses (a),
              (b). These represent a set of (normally) exclusive
              alternatives.

       [...]

       There should always be exactly 2 spaces between the list
       symbol and the elements. This doesn’t apply to "tagged
       paragraphs", which use the default indentation rules.

> to the formatter.
>
>     MANROFFOPT="-dMF=B"
>
> You might wish to set this in your shell startup file and export the
> variable.
>
> 2. When rendering pages directly with groff, nroff, or troff, you can
> set the string on the command line.
>
>     nroff -dMF=B -mandoc man1/getent.1
>
> 3. You can set this string in groff man(7)'s site-local configuration
> file. Its location depends on groff's build-time parameters, but is
> documented in the groff_man(7) page. On Debian-based systems, it's
> in /etc/groff/man.local. Add the following line (with no leading
> spaces).
>
> .ds MF B\"
>
> (The trailing '\"' is a safety measure.[2])
>
> [1] https://lore.kernel.org/linux-man/20230803175738.dqpxy3dirl3bpznv@illithid/T/#u
> [2] https://www.gnu.org/software/groff/manual/groff.html.node/Strings.html

You didn't show proof that this patch is good. I included your old
tests in the commit message in the 'MR' branch. Please run them and
show the results here. Here's what I included:

On 2023-08-01 00:50, G. Branden Robinson wrote:
> I used a couple of scripts.
>
> $ cat ATTIC/dump-pages.sh
> #!/bin/sh
>
> pages=$(grep -L '^\.so ' man*/* | sort)
> groff -t "$@" -m andoc -T utf8 -P -cbou $pages
>
> $ cat ATTIC/dump-pages-left-adjustment-no-hyphenation.sh
> #!/bin/sh
>
> pages=$(grep -L '^\.so ' man*/* | sort)
> groff -t -dAD=l -rHY=0 -m andoc -T utf8 -P -cbou $pages
>
> And here's how I ran them.
>
> sh ATTIC/dump-pages.sh >| DUMP1
> sed -i -f ./ATTIC/MR.sed $(grep -L '^\.so ' man*/*)
> sh ATTIC/dump-pages-left-adjustment-no-hyphenation.sh >| DUMP2
> diff -U0 -b DUMP1 DUMP2 | less -R
>
> That confirmed that there were "no changes" (with the caveat noted
> above).
>
> sh ATTIC/dump-pages.sh >| DUMP2
> diff -U0 -b DUMP1 DUMP2 | less -R
> diff -U0 -b DUMP1 DUMP2 | wc -l
>
> I used these to eyeball and measure whether there were any formatting
> changes even with default adjustment and hyphenation enabled. It showed
> me _tons_ of man page names no longer getting broken (and hyphenated)
> across lines, and nothing else that I noticed.
>
> With the previous empty diff in hand, I decided that I hadn't regressed
> the text of the pages.
>
> Signed-off-by: "G. Branden Robinson" <g.branden.robinson@xxxxxxxxx>
[Jakub has concerns that groff-1.23.0 was released too recently]
Nacked-by: Jakub Wilk <jwilk@xxxxxxxxx>

Cheers,
Alex

> ---
>
> v3: Add notice about expected typeface change in man page cross
> references. Explain how to configure it.
>
> [Alex: I'm sending this out via Neomutt, and it _says_ this part of the
> message is "text/plain; us-ascii". If you receive it quoted-printable,
> be advised that the equals signs in the foregoing are likely corrupted.]

If I open the email source in Thunderbird, I see things like this:

MANROFFOPT=3D"-dMF=3DB"

I wouldn't have expected neomutt(1) to fuck up emails like that! I'd
expect that from Thunderbird, but not from neomutt(1). Could you also
send the patch as an encrypted attachment, which no program has the
right to mess with? (I still want the inline message, for discussion,
but I'll apply from the encrypted one.)

Cheers,
Alex

--
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5
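
As a quick sanity check of the conversion quoted above, the first two
substitutions (simplest case and trailing punctuation) can be tried on a
few sample lines. This is only a sketch: it assumes GNU sed (for `\+`),
it omits the other two substitutions from the script, and the function
name convert_xrefs is made up for illustration.

```sh
#!/bin/sh
# convert_xrefs: hypothetical helper wrapping the first two substitutions
# of the quoted sed script.  Requires GNU sed because of `\+`.
convert_xrefs() {
    sed \
        -e 's/^.[BI]R \(\\%\)*\([.@_[:alnum:]\\-]\+\) (\([1-9a-z]\+\))$/.MR \2 \3/' \
        -e 's/^.[BI]R \(\\%\)*\([.@_[:alnum:]\\-]\+\) (\([1-9a-z]\+\))\([^[:space:]]\+\)$/.MR \2 \3 \4/'
}

# Two cross references and one ordinary text line as input.
printf '%s\n' '.BR ls (1)' '.IR groff (1),' 'Plain text line.' | convert_xrefs
# Prints:
#   .MR ls 1
#   .MR groff 1 ,
#   Plain text line.
```

Lines that do not look like a font-alternation cross reference pass
through unchanged, which is why the dumps can be compared before and
after running the script over the whole tree.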
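
On the `=3D` seen in the raw message source: in the quoted-printable
transfer encoding, `=3D` is the encoded form of a literal `=`. The
following sketch undoes only that one escape, to show what the line
should decode to. It is not a full quoted-printable decoder (soft line
breaks and other `=XX` escapes are ignored), and decode_eq is a name
invented here.

```sh
#!/bin/sh
# decode_eq: hypothetical helper that reverses only the "=3D" escape.
# Assumption: no other quoted-printable escapes matter for this check.
decode_eq() {
    sed 's/=3D/=/g'
}

printf '%s\n' 'MANROFFOPT=3D"-dMF=3DB"' | decode_eq
# Prints: MANROFFOPT="-dMF=B"
```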