Re: [PATCH v2] man*/: ffix (migrate to `MR`)

Hi Alex,

At 2023-07-31T23:47:50+0200, Alejandro Colomar wrote:
> > When the text of all Linux man-pages documents (excluding those
> > containing only `so` requests) is dumped, with adjustment mode 'l'
> > ("-dAD=l") and automatic hyphenation disabled ("-rHY=0") before and
> > after this change, there is no change to rendered output.
> 
> It would be interesting to see a script that corroborates the above
> paragraph.  It might help other projects that may want to migrate to
> MR.

Sure.  I used a couple of scripts.

  $ cat ATTIC/dump-pages.sh
  #!/bin/sh

  pages=$(grep -L '^\.so ' man*/* | sort)
  groff -t "$@" -m andoc -T utf8 -P -cbou $pages

  $ cat ATTIC/dump-pages-left-adjustment-no-hyphenation.sh
  #!/bin/sh

  pages=$(grep -L '^\.so ' man*/* | sort)
  groff -t -dAD=l -rHY=0 -m andoc -T utf8 -P -cbou $pages
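As a small illustration of the page selection both scripts rely on: `grep -L` prints the names of files that contain no matching line, so link pages consisting of an `so` request drop out of the list. This is only a sketch; the helper name `list_pages` and its directory argument are made up for the example and are not part of the scripts above.

```sh
#!/bin/sh
# list_pages DIR (hypothetical helper)
# Print, sorted, the files under DIR/man*/ that contain no line
# beginning with ".so ", i.e. pages that are not mere links.
list_pages() {
    ( cd "$1" && grep -L '^\.so ' man*/* | sort )
}
```

Run from the top of a man-pages checkout, `list_pages .` should yield the same list that ends up in `$pages`.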

And here's how I ran them.

  sh ATTIC/dump-pages.sh >| DUMP1
  sed -i -f ./ATTIC/MR.sed $(grep -L '^\.so ' man*/*)
  sh ATTIC/dump-pages-left-adjustment-no-hyphenation.sh >| DUMP2
  diff -U0 -b DUMP1 DUMP2 | less -R

That confirmed that there were "no changes" (with the caveat noted
above).
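For anyone who would rather script that check than eyeball it in less(1), here is a minimal sketch. The function name `report_dump_diff` is hypothetical; it relies only on diff's exit status and the same `-U0 -b` options used above.

```sh
#!/bin/sh
# report_dump_diff BEFORE AFTER (hypothetical helper)
# Print "identical" if the two dumps match when changes in the amount
# of white space are ignored (diff -b); otherwise print the number of
# added and removed lines in the zero-context unified diff.
report_dump_diff() {
    if diff -b "$1" "$2" >/dev/null; then
        echo identical
    else
        # Skip the two "---"/"+++" header lines, then count the
        # remaining lines that start with "-" or "+".
        diff -U0 -b "$1" "$2" | tail -n +3 | grep -c '^[-+]'
    fi
}
```

For example, `report_dump_diff DUMP1 DUMP2` after the commands above.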

  sh ATTIC/dump-pages.sh >| DUMP2
  diff -U0 -b DUMP1 DUMP2 | less -R
  diff -U0 -b DUMP1 DUMP2 | wc -l

I used these to eyeball and measure whether there were any formatting
changes even with default adjustment and hyphenation enabled.  They
showed me _tons_ of man page names no longer getting broken (and
hyphenated) across lines, and nothing else that I noticed.

With the previous empty diff in hand, I decided that I hadn't regressed
the text of the pages.

If there are further sanity checks we can apply, I'm open to
suggestions.

Since you had me looking at my shell history, I'll share that I did a
"git co ." (co = alias for "checkout") 18 times in the course of
developing MR.sed.  Those drove most of my recent patch submissions
immediately prior to this one.  I could have done 18 more without
fatiguing (albeit not necessarily without frustration with myself for not
getting my sed right).  But that's the beauty of sed, and
Bash/readline's "reverse-search-history" and "operate-and-get-next"
features.

As it turned out, my sed was pretty good, except for the missing use
case you identified, my fix for which worked on the first try.  The
irregularity of the page inputs was the tricky bit.

At one point I had a fearful episode that I'd misdesigned `MR` for one
scenario, and much like the Master being terrorized by the Keller
Machine, I had visions of the Doctor (Ingo Schwarze) laughing at me and
telling me he told me so and winning the whole world over to mdoc(7) in
one stroke.  But it was fine (attached).

There are _still_ some `ad` requests scattered around (outside of tbl(1)
text blocks), but I didn't go after those because they weren't in the
way of my objective.  Eventually it'd be good to scrub those too.

> > I prepared this change with the following GNU sed script.
> > 
> > \# Handle simplest cases: ".BR foo (1)" and ".IR foo (1)".
> 
> What I do to avoid git messing with these comments is to write a
> leading space.  For git, only '#' in column 1 are special.  Since most
> compilers and interpreters allow a space before a commented line, a
> leading space is fine.

Ahh.  A leading backslash is the only workaround I've ever noticed.
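A quick way to check that the leading-space form stays pastable into a sed script is sketched below. The substitution rule is a simplified stand-in written for this example, not the rule from MR.sed, and the helper name `write_demo_script` is likewise hypothetical.

```sh
#!/bin/sh
# write_demo_script FILE (hypothetical helper)
# Write a two-line sed script whose comment line begins with a space,
# as it would appear when copied out of the commit message.
write_demo_script() {
    cat > "$1" <<'EOF'
 # Handle simplest case: ".BR foo (1)".
s/^\.BR \([^ ]*\) (\([0-9]\))$/.MR \1 \2/
EOF
}
```

With GNU sed, the indented comment is skipped and the rule still applies:

  write_demo_script demo.sed
  printf '.BR foo (1)\n' | sed -f demo.sed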

> I've edited the commit message to have spaces, so that it's directly
> pastable into a MR.sed script.  Oh, and I included "$ cat MR.sed;" in
> the commit message; I couldn't not do it.  :)

No worries. :)

> I've applied the patch (or rather, the script), but won't push it yet.
> If you send a run of commands that prove no differences before and
> after, I'll amend the commit message with it.

Please do verify it yourself with the tools above (or better ones).  I'm
well aware that this is a huge change that can make people nervous.

Regards,
Branden

Attachment: try-to-break-MR.man
Description: Unix manual page

Attachment: signature.asc
Description: PGP signature

