Re: names of ISO 8859 encodings

Hi Helge,

At 2024-12-14T06:12:15+0000, Helge Kreutzmann wrote:
> Am Fri, Dec 13, 2024 at 06:56:54PM -0600 schrieb G. Branden Robinson:
> > Oy vey.  Helge Kreutzmann submitted a similar bug report to groff
> > and I was planning to make the ISO -> ISO/IEC change to its man
> > pages.
> 
> I'm not going into the business of valuating which standards should be
> adhered to. But when referring to the proper document the correct
> name should be given IMHO.

Possibly the "use/mention" distinction of linguistics would be helpful
here.[1]  In some technical discussion contexts, we merely _mention_ a
character encoding standard.  For instance, "This program is capable of
transliterating any document using an ISO/IEC 8859 character encoding to
valid UTF-8."

In other contexts, we _use_ the identifier itself, perhaps as an input
argument to a program.  For example:

   $ iconv -f iso-8859-1 -t utf-8 NEWS

In this shell command, we must spell the character encoding specifiers
exactly as such,[2] and when documenting the foregoing in an example in
a man page, we are well advised to spell the hyphen-minus signs with
leading backslashes.

.RS
.EX
$ \c
.B "iconv \-f iso\-8859\-1 \-t utf\-8 NEWS"
.EE
.RE
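
A minimal sketch of automating that escaping, assuming a POSIX shell and
sed(1).  The function name roff_escape_hyphens is hypothetical; it is
not part of groff or the man-pages project.  It blindly escapes every
hyphen-minus, which suits literal command text like the example above
but not running prose, where a plain "-" is an ordinary hyphen.

```shell
# Hypothetical helper: prefix each hyphen-minus in a literal command
# line with a backslash, as man page source expects for text the
# reader may copy and paste.
roff_escape_hyphens() {
    printf '%s\n' "$1" | sed 's/-/\\-/g'
}

escaped=$(roff_escape_hyphens 'iconv -f iso-8859-1 -t utf-8 NEWS')
printf '%s\n' "$escaped"
```

The printed result should match the argument given to the .B macro in
the man page fragment above.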

Alex, do you think this issue is enough of a trip hazard to warrant
presentation in man-pages(7)?

> My personal opinion is that correct typography is important, but on
> quick reading I probably would not spot the differences among the
> various dashes for example. So for me, having all the correct letters
> is important and of course, to copy and paste text (e.g. code) where
> necessary, even if that violates typography standards.

I think we can avoid violating standards of typography; more precisely,
the process of rendering to an output device of limited capability will
violate those standards for us.[3]  For example, a character-cell
terminal device generally can't (1) render arbitrary glyph sequences
superscripted or subscripted;[4] (2) change the type size;[5] or (3)
change the font family (to use letterforms with or without serifs) for
only part of the rendered text (as opposed to the entire display,
including scrollback buffer) at once.

> And yes, I'm well aware that Branden and Donald Knuth (and successors)
> strive for well printed documents, and I'm glad for this.

That's pretty august company to be paired with.  Lest anyone get any
inflated notions of my role in groff, Joe Ossanna of Bell Labs wrote
troff in the mid-1970s.  After his untimely death, Brian Kernighan
refactored troff circa 1980 into "device-independent troff".  These were
proprietary to AT&T (and commercial products for a while), so the FSF
hired James Clark to write a clean reimplementation of AT&T troff,
called groff, in about 1989.  Werner Lemberg later became groff
maintainer and added many features to it such that it became a viable
alternative to TeX in many more applications (partisan preferences
aside).  Then Bertrand Garrigues did some mostly unsung but badly needed
work on groff's build system, making it more pleasant to work with.  My
role has largely been (1) fixing bugs; (2) writing automated tests to
(try to) ensure that dead bugs stay dead; (3) revising and correcting
documentation; and (4) making modest extensions and reforms to the *roff
language and some of the macro packages, provoking heated arguments
and/or revealing formerly unspecified behavior, around which some people
of course poured fast-drying cement in fits of delirium years ago.

In software as in religion, the commandments held most sacrosanct are
those that no one thought to write down in the first place.

("Of _course_ I can interchange pointers and ints.  No one ever said I
can't!"  Eventually, they did say so.  To much gnashing of teeth.)

Regards,
Branden

And now the footnotes, where we play free-association rambling bingo.

[1]  https://en.wikipedia.org/wiki/Use%E2%80%93mention_distinction

[2]  a given system's iconv(1) command may recognize alternative names
     for some encodings

[3]  For example, the bash(1) man page contains this:

.if n Bash is Copyright (C) 1989-2024 by the Free Software Foundation, Inc.
.if t Bash is Copyright \(co 1989-2024 by the Free Software Foundation, Inc.

     In principle, this shouldn't be necessary.  Chet should just write
     the second line without the ".if t" conditional and delete the
     first.  The output device should know how to gracefully map the
     special character "\(co" to a copyright sign, and itself do the job
     of translating it to "(C)" if it has only an ASCII repertoire.
     Presumably, at some point in the past Chet (or the initial Bash
     maintainer, Brian Fox) used an nroff program that was defective,
     and also labored under the no-longer-correct belief that
     omitting a copyright symbol from one's notice was a fatal defect
     that effectively placed the work in the public domain.  That
     stopped being true as of 1 March 1989.[7]  Further, prior to
     guidance issued by the U.S. Copyright Office in the decades since,
     the use of "(C)" as a substitute for a copyright sign _may not have
     sufficed_ to prevent the copyright notice from being regarded as
     defective.  The Copyright Office, then and now, prefers the
     abbreviation "copr." when © is typographically unavailable.[7]
     Nowadays, its advice is that "c" (note lowercase) is an "acceptable
     variant", that _may_ retain the efficacy of the copyright notice.
     However, it is not the U.S. Copyright Office but the courts that
     ultimately arbitrate such things.  Moreover, given recent
     developments, the Office's guidance to authors need not carry any
     weight to a federal judge.  Between the U.S. Supreme Court's
     overruling of "Chevron deference"[8] and the availability of a federal
     district court in Western Texas offering itself as a venue to any
     right-wing plaintiff in the country and pursuing a crusade of
     maximalist Federalist [read: monarchist] Society doctrine with a
     penchant for issuing nationwide permanent injunctions,[9][10] the
     status of any federal statute, executive agency guidance, or even
     constitutional provision[11] is uncertain for the next few years at
     least.  But rest assured--we term this sort of radical disruption
     of American jurisprudence a "conservative" judicial philosophy.  👍

[4]  Often, the decimal digits 0-9 are available as superscripts.  This
     selection is too meager for general typography, let alone
     mathematical typesetting where arbitrary, complex expressions may
     occur in exponents, for instance.  Occasionally you need an
     integral up there.

[5]  The DEC VT100 and its successors could do double-width and
     double-size type.[6]  Try this in your preferred terminal emulator.

     $ printf "$(tput bold)\e#3See also\n\e#4See also$(tput sgr0)\n\
          $(tput sitm)xterm$(tput ritm)(1)\n\n\e#6Patch #395    2024-09-11\
          $(tput sitm)xterm$(tput ritm)(1)\e#5\n"

     Anyone think these are worth supporting in grotty(1)?  ;-)

[6]  https://vt100.net/docs/vt510-rm/DECDHL.html
     https://vt100.net/docs/vt510-rm/DECDWL.html

[7]  https://www.copyright.gov/circs/circ03.pdf
[8]  https://www.scotusblog.com/2024/06/supreme-court-strikes-down-chevron-curtailing-power-of-federal-agencies/
[9]  https://www.americanprogress.org/article/the-5th-circuit-court-of-appeals-is-spearheading-a-judicial-power-grab/

[10] I would not personally wager that copyright holders have much to
     fear under the current regime; revenues consequent to copyrights
     are a form of monopoly rent and therefore a worldwide tent pole of
     conservative political economy.  But, if a powerful stakeholder has
     a prospect of a sufficiently large windfall from a radical change
     to copyright protections, and is willing to spend lavishly enough
     on political campaigns and super PACs, who knows what might happen?

     Here's some model statutory language.  "Any work under copyright by
     any entity other than the Walt Disney Company, its subsidiaries, or
     affiliates, enters the public domain as of January 1 of the year
     subsequent to its fixation in tangible form."

     I mean, that's just "common sense", right?[12]  Only Disney has any
     business adapting anything into a feature film, or exercising
     merchandising rights.  Duh.

[11] https://www.cbsnews.com/news/what-is-birthright-citizenship/

[12] another term debased by conservative/centrist political rhetoric

     I offer my own definition, in the spirit of Ambrose Bierce.

     "Commonsense solution": a course of action I want to take for
     reasons I will not share with you.
