*roff `\~` support (was: [PATCH 4/6] xattr.7: wfix)

Hi Ingo,

At 2022-08-12T16:30:01+0200, Ingo Schwarze wrote:
> G. Branden Robinson wrote on Thu, Aug 11, 2022 at 03:17:14PM -0500:
> > At 2022-08-11T14:48:51+0200, Ingo Schwarze wrote:
> >> Alejandro Colomar wrote on Mon, Aug 01, 2022 at 03:28:03PM +0200:
> 
> >>> I'd like to arrive at some consensus on usage of \~ and '\ '.
> 
> >> In manual pages, always use "\ " and never use "\~", period.
> 
> > This is hugely overstated.
> 
> >> The former is portable and the latter is a GNU extension.
> 
> > ...that is over 30 years old and supported by Heirloom Doctools
> > troff for 17 years now, neatroff for about six, and your mandoc for
> > three.
> 
> Actually, mandoc supports \~ at least since Sep 17 2009:
> https://cvsweb.bsd.lv/mandoc/Attic/chars.in?rev=1.1&content-type=text/x-cvsweb-markup

Whoops!  I regret the error, and will update groff's Texinfo manual to
correct this.

> > plan9port troff doesn't either, and its laudable introduction
> > of a man(7) MR macro notwithstanding, its activity level is
> > not high.
> 
> There are people using Plan 9 for practical work though, they have
> even occasionally posted on the groff and mandoc lists, so that is a
> bit more of a problem.

I have no moral objection to submitting a patch; I don't know my way
around the AT&T troff code base (which Plan 9 troff mostly is) nearly as
well as groff, though, and, as ever, available time is scarce.  But, if
that's what it takes to get this escape sequence de facto standardized,
and no one else will do it, that will move it up the priority queue.

I don't expect full support to be trivial.  I don't think AT&T troff has
a concept of a space that is adjustable but not breakable.  If that
blows out the effort/reward estimate, treating `\~` as a synonym of `\ `
as mandoc does _should_ be trivial.

Yup, it looks like it is.

https://github.com/9fans/plan9port/blob/master/src/cmd/troff/n1.c#L515
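To make the "adjustable but not breakable" distinction concrete, here is a
toy Python sketch of line adjustment.  It is not derived from any troff
source; the function name `justify`, its parameters, and the left-to-right
distribution of padding are all assumptions made for illustration (real
formatters distribute the extra space differently and also decide where to
break lines, which this ignores).

```python
import re

def justify(line, width, tilde_is_adjustable=True):
    """Pad `line` to `width` columns by widening adjustable spaces.

    Hypothetical model: an ordinary space is adjustable, `\\ ` is a
    fixed-width space, and `\\~` is adjustable only if the formatter
    implements it fully rather than as a synonym of `\\ `.
    """
    # Split into words and the separators between them, keeping separators.
    parts = re.split(r"(\\~|\\ | )", line)
    words, gaps = parts[0::2], parts[1::2]
    # Indices of the gaps that are allowed to grow.
    adjustable = [i for i, gap in enumerate(gaps)
                  if gap == " " or (gap == "\\~" and tilde_is_adjustable)]
    widths = [1] * len(gaps)
    extra = width - (sum(len(w) for w in words) + len(gaps))
    if adjustable:
        # Hand out the missing columns one at a time, round-robin.
        for k in range(max(extra, 0)):
            widths[adjustable[k % len(adjustable)]] += 1
    out = words[0]
    for word, n in zip(words[1:], widths):
        out += " " * n + word
    return out

print(justify(r"is\~1 and x", 13))
print(justify(r"is\~1 and x", 13, tilde_is_adjustable=False))
```

With full support the space written as `\~` shares in the padding; with the
synonym treatment it stays one column wide and the ordinary spaces absorb
all of it.  On a terminal the difference is small, which is part of why the
synonym version may be an acceptable first step.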

> > I would pessimistically assume that most or all proprietary Unix
> > troffs branched off from V7 Unix troff or early device-independent troff
> > (maybe DWB 1.0 troff, ca. 1984 [?, 1]) lack support for `\~`.
> > https://github.com/n-t-roff/Solaris10-ditroff/blob/master/troff/n1.c#L797
> 
> That does sound likely.  As an example, look at Oracle Solaris 11:
> 
>    > uname -a
>   SunOS unstable11s 5.11 11.3 sun4u sparc SUNW,SPARC-Enterprise
>    > printf "a\\\\~b\n" | nroff | head -n 1
>   a~b
>    > printf "a\\\\~b\n" | groff -T ascii | head -n 1
>   a b

Yes.  The rule is, if no semantics are defined for the function selector
(the character after the escape character), then the character is
treated as if it were not escaped.
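
A minimal Python sketch of that rule, for illustration only.  It is a toy
model rather than code from any troff; the names `LEGACY_ESCAPES`,
`EXTENDED_ESCAPES`, and `render` are hypothetical, and the tables list only
the two escapes under discussion.

```python
# Toy model of the rule, not real troff code.  Each table maps a function
# selector (the character after the backslash) to what it produces.
LEGACY_ESCAPES = {" ": " "}               # formatter without `\~`
EXTENDED_ESCAPES = {" ": " ", "~": " "}   # formatter that recognizes `\~`

def render(text, escapes):
    """Expand escapes; an undefined selector is emitted as itself."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            selector = text[i + 1]
            # No semantics defined: treat the character as if unescaped.
            out.append(escapes.get(selector, selector))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)

print(render("a\\~b", LEGACY_ESCAPES))
print(render("a\\~b", EXTENDED_ESCAPES))
```

The first call yields `a~b` and the second `a b`, mirroring the nroff and
groff results in the transcript quoted above.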

> > I further note that groff has a long tradition of inclusion in BSD
> > Unix, https://minnie.tuhs.org/cgi-bin/utree.pl?file=Net2/usr/src/usr.bin/groff/VERSION
> 
> Yes.  Cynthia already considered dropping support for Kernighan's
> troff, but the CSRG vetoed that.  Inclusion of groff wasn't
> controversial even at a time when groff didn't have its own version
> control yet.

It seems strange now how revision control ever seemed like a luxury.
For a few years I maintained Debian's XFree86 packages, which had
_megabytes_ of patches on top of upstream, without using SCCS or RCS or
CVS and even without a tool as nice as quilt.

I was completely insane.  On the other hand, it trained me to be pretty
careful.

Eventually, I acquired sanity and started using Subversion.

> Frankly, i have no idea how to estimate the number of actively used
> installations of Plan 9, Solaris (any version), and possibly
> additional commercial systems like AIX and HP-UX, or how to check
> what the latter support.

Users/maintainers of these systems have to get involved and speak up.
There is an unbounded quantity of Russell's Teapots labeled with names
of Unix variants that have gone defunct.

Without evidence, we must assume their numbers are too small to serve as
a gate on development.

That said, it remains polite to document changes that would affect them.

> There might be more systems out there parsing manual pages (not
> necessarily full-featured roff(7) implementations like those
> you listed), but providing specific evidence of such systems
> would likely be my job to back up my advice.  I'm not searching
> for them right now because we already have a few relevant examples.

plan9port's troff seems like the only case for which we have concrete
evidence, and Russ Cox has already been a pleasure to work with.

I don't know that any user of OpenSolaris/Illumos troff has ever spoken
up on the groff mailing list, which in spite of its
implementation-specific name seems to be the water cooler for what
remains of the global *roff community.

The good news is that, both being descended from AT&T troff and, from
what I've seen, neither having been re-architected, if someone comes up
with `\~` support for plan9port troff, I predict that it will be
mergeable into OpenSolaris/Illumos troff without much difficulty.

...especially the trivial `\ ` synonym version discussed above.

> Even authors might disagree which is more important:
> 
>  (1) The typographical difference between "\~" and "\ "
>      in PDF and PostScript output of manual pages.
> 
>  (2) Correctly rendering whitespace on Plan 9, Solaris,
>      and likely some other systems *at all*, for any output mode.
> 
> I suspect that many would prefer (2) - of course, that claim is hard
> to quantify.

Another thing to consider is how bad the damage to comprehension is if a
tilde shows up in place of a space.

In a prose phrase, it is likely to be distracting and annoying but will
not be a barrier to comprehension.

[from groff_diff(7):]
  For example, if the current font is\~1 and font position\~1 is

In synopses of commands and language features (like *roff requests or
macros), I think anyone already familiar with Unix command lines or
*roff languages, respectively, can still push their way past it, but it
is worse.

[from gdiffmk(1):]
  .RB [ \-a\~\c
  .RB [ \-c\~\c
  .RB [ \-d\~\c
  .RB [ \-x\~\c
  .BI \-a\~ add-mark
  .BI \-c\~ change-mark
  .BI \-d\~ delete-mark
  .BI \-M\~ "mark1 mark2"
  .BI \-x\~ diff-command
  .BI \-x\~ diff-command

[from groff_diff(7):]
  .BI .chop\~ object
  .BI .class\~ "name c1 c2\~"\c
  .BI .close\~ stream
  .BI .composite\~ glyph1\~glyph2
  .BI .color\~ n
  .BI .cp\~ n

The tilde showing up in boldface would be especially disappointing.

On the gripping hand, such aggressive use of `\~` is much more often
seen in groff man pages than in (any?) others, and groff man pages can
be expected to be formatted with groff or another `\~`-recognizing
formatter much of the time.

> It would probably be good to arrive at a consensus recommendation
> for such cases because many manual page authors probably have little
> interest in judging such questions themselves.  Consensus seems
> hard to reach though.  So maybe the best we can do is to simply
> state the fact that \~ is still not supported by a few not very widely
> used, but still somewhat significant roff implementations like Plan 9
> and Solaris, even though that forces authors to draw their own
> conclusion.

I could easily copy the (now-corrected with respect to the age of
mandoc's `\~` support) material about this escape sequence from our
groff Texinfo manual to groff_man_style(7), where the "Portability"
section quoted earlier in the thread is housed.

As with the uptake of groff man(7) extension macros (be they 15 years
old or more recent), a software project's documentors may be better
placed than we are to assess the formatting capabilities of their users.

Regards,
Branden


