Hi Branden,

On 2023-08-01 00:50, G. Branden Robinson wrote:
> Hi Alex,
>
> At 2023-07-31T23:47:50+0200, Alejandro Colomar wrote:
>>> When the text of all Linux man-pages documents (excluding those
>>> containing only `so` requests) is dumped, with adjustment mode 'l'
>>> ("-dAD=l") and automatic hyphenation disabled ("-rHY=0") before and
>>> after this change, there is no change to rendered output.
>>
>> It would be interesting to see a script that corroborates the above
>> paragraph. It might help other projects that may want to migrate to
>> MR.
>
> Sure. I used a couple of scripts.
>
> $ cat ATTIC/dump-pages.sh
> #!/bin/sh
>
> pages=$(grep -L '^\.so ' man*/* | sort)
> groff -t "$@" -m andoc -T utf8 -P -cbou $pages
>
> $ cat ATTIC/dump-pages-left-adjustment-no-hyphenation.sh
> #!/bin/sh
>
> pages=$(grep -L '^\.so ' man*/* | sort)
> groff -t -dAD=l -rHY=0 -m andoc -T utf8 -P -cbou $pages
>
> And here's how I ran them.
>
> sh ATTIC/dump-pages.sh >| DUMP1
> sed -i -f ./ATTIC/MR.sed $(grep -L '^\.so ' man*/*)
> sh ATTIC/dump-pages-left-adjustment-no-hyphenation.sh >| DUMP2
> diff -U0 -b DUMP1 DUMP2 | less -R
>
> That confirmed that there were "no changes" (with the caveat noted
> above).
>
> sh ATTIC/dump-pages.sh >| DUMP2
> diff -U0 -b DUMP1 DUMP2 | less -R
> diff -U0 -b DUMP1 DUMP2 | wc -l
>
> I used these to eyeball and measure whether there were any formatting
> changes even with default adjustment and hyphenation enabled. It showed
> me _tons_ of man page names no longer getting broken (and hyphenated)
> across lines, and nothing else that I noticed.
>
> With the previous empty diff in hand, I decided that I hadn't regressed
> the text of the pages.
>
> If there are further sanity checks we can apply, I'm open to
> suggestions.

Nah, I eyeballed random samples of the diff and it looked good. That,
and your extensive tests, make me confident enough. If we screwed
anything up, we can fix it. The only concern I had some time ago was
with code like exit(1), but that should be using italics today, so it
shouldn't be a problem. I can't imagine big issues.

>
> Since you had me looking at my shell history, I'll share that I did a
> "git co ." (co = alias for "checkout") 18 times in the course of
> developing MR.sed. Those drove most of my recent patch submissions
> immediately prior to this one. I could have done 18 more without
> fatiguing (albeit not necessarily without frustration with myself for
> not getting my sed right). But that's the beauty of sed, and
> Bash/readline's "reverse-search-history" and "operate-and-get-next"
> features.
>
> As it turned out, my sed was pretty good, except for the missing use
> case you identified, and my fix for which worked on the first try. The
> irregularity of the page inputs was the tricky bit.
>
> At one point I had a fearful episode that I'd misdesigned `MR` for one
> scenario, and much like the Master being terrorized by the Keller
> Machine, I had visions of the Doctor (Ingo Schwarze) laughing at me and
> telling me he told me so and winning the whole world over to mdoc(7) in
> one stroke. But it was fine (attached).
>
> There are _still_ some `ad` requests scattered around (outside of tbl(1)
> text blocks), but I didn't go after those because they weren't in the
> way of my objective. Eventually it'd be good to scrub those too.
>
>>> I prepared this change with the following GNU sed script.
>>>
>>> \# Handle simplest cases: ".BR foo (1)" and ".IR foo (1)".
>>
>> What I do to avoid git messing with these comments is to write a
>> leading space. For git, only '#' in column 1 are special. Since most
>> compilers and interpreters allow a space before a commented line, a
>> leading space is fine.
>
> Ahh. A leading backslash is the only workaround I've ever noticed.
>
>> I've edited the commit message to have spaces, so that it's directly
>> pastable into a MR.sed script. Oh, and I included "$ cat MR.sed;" in
>> the commit message; I couldn't not do it. :)
>
> No worries. :)
>
>> I've applied the patch (or rather, the script), but won't push it yet.
>> If you send a run of commands that prove no differences before and
>> after, I'll amend the commit message with it.
>
> Please do verify it yourself with the tools above (or better ones). I'm
> well aware that this is a huge change that can make people nervous.

I applied the patch, amended the message with a quote from this email,
and pushed to the MR branch in my private git repo at
<http://www.alejandro-colomar.es/src/alx/linux/man-pages/man-pages.git/log/?h=MR>.

Oh, and I also removed a few pages from your patch, per CONTRIBUTING
guidelines:

    Notes
        External and autogenerated pages
            A few pages come from external sources. Fixes to the pages
            should really go to the upstream source.

            tzfile(5), tzselect(8), zdump(8), and zic(8) come from the
            tz project <https://www.iana.org/time-zones>.

            bpf-helpers(7) is autogenerated from the Linux kernel
            sources using scripts. See man-pages commits 53666f6c3 and
            19c7f7839 for details.

Anyone that wants to check it, feel free to have a look at it.

Cheers,
Alex

>
> Regards,
> Branden

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5
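
P.S. For anyone who wants to reuse the page selection from the quoted
scripts in another project, here is a minimal sketch of what the
`grep -L '^\.so '` step does on its own. The function name
list_real_pages and the demo file names are hypothetical and are not
part of the scripts above or of the man-pages tree; it assumes a grep
that supports -L, as the quoted scripts already do.

```shell
#!/bin/sh
# Sketch only: select man pages that are real documents and skip
# pages that merely redirect to another page with a `so` request.
# list_real_pages is a hypothetical helper name.

list_real_pages() {
    # grep -L prints the names of files containing NO matching line,
    # so any file with a line starting with ".so " is left out.
    grep -L '^\.so ' "$@" | sort
}

# Demo on a throwaway directory (file names invented for illustration).
demo=$(mktemp -d)
mkdir "$demo/man3"
printf '.TH FOO 3\n.SH NAME\nfoo \\- do something\n' > "$demo/man3/foo.3"
printf '.so man3/foo.3\n' > "$demo/man3/bar.3"

real_pages=$(cd "$demo" && list_real_pages man*/*)
echo "$real_pages"    # prints: man3/foo.3

rm -rf "$demo"
```

The resulting list is what the quoted scripts pass both to groff and to
`sed -i`, so the link-only pages are never formatted or edited.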