Hi, Alex!

At 2022-03-20T01:04:17+0100, Alejandro Colomar (man-pages) wrote:
> Michael introduced the following commit, which is incorrect (triggers
> a groff(1) error; see below).  Do you know what is intended here?
> Could you please propose a fix?

Sure!  The punctuation does get a bit bewildering.

The first topic is equivalence classes in globs.

> LINT (groff) tmp/lint/man7/glob.7.lint.groff.touch
> troff man7/glob.7 195 error '\`' is not allowed in an escape name
> troff man7/glob.7 195 warning can't find special character ''

>  For example, "\fI[[=a=]]\fP" might be equivalent
> -to "\fI[a\('a\(`a\(:a\(^a]\fP", that is,
> +to "\fI[a\('a\(\`a\(:a\(^a]\fP", that is,

UTF-8 continuation bytes follow in this message.

So what we're trying to say is

    "[=a=]" might be equivalent to "[aáàäâ]"

The man page is using groff special character escape sequences that are
compatible with AT&T troff (1973) in _form_, but the special character
_identifiers_ themselves are not portable that far back.  The form is:

    \(xx

...where "xx" is _exactly_ two characters forming an identifier for a
specific special character.

As is somewhat well known, groff supports identifiers of arbitrary
length in escape sequences; anywhere AT&T troff has an escape sequence
syntax form ending in "(xx", groff supports an additional form
"[xxxxxxx]".

Nota bene that word "identifier".  The ones we see above are aliases for
commonly used ISO Latin-1 (1985) characters.  groff supports a more
systematic notation for composite glyphs, that being

    \[base-glyph composite-1 composite-2 ... composite-n]

and in the instant case, only one composite glyph is used.

Glyph identifiers in groff must consist of valid identifier characters.
The escape character \ is _not_ interpreted as an identifier character,
but has its usual meaning of introducing an escape sequence.  Thus, when
encountering

    \(\`a

the parser hits the expansion of \` and has problems.

\` is itself an alias for another special character escape sequence:
"\(ga".
(This alias _is_ portable all the way back to AT&T troff, and is
documented in Ossanna 1976, "Nroff/Troff User's Manual"--but that still
doesn't make it a valid part of a special character identifier.
Heirloom Doctools troff silently ignores it, and I thus suspect Unix V7
troff did too.)

Thus, the special character you're naming has another special character
as part of its identifier.  That is not allowed.  That is why an error
is produced.

Now, for the part people actually care about, which is how to fix it:
take the escape character off of that `.

You thus want

+to "\fI[a\('a\(`a\(:a\(^a]\fP", that is,

If you wanted to write this without using any aliases, you could adopt
groff syntax.

+to "\fI[a\[a aa]\[a ga]\[a ad]\[a a^]\fP", that is,

I don't know if people regard that as more or less impenetrable.  It is
more _flexible_, and admits usage of diacritics/combining characters not
envisioned by AT&T troff or ISO Latin-1.  groff supports a baker's
dozen.  They are in a table titled "Accents" in groff_char(7) (1.22.4).

> diff --git a/man8/zic.8 b/man8/zic.8
> index 940d6e814..aeca0e726 100644
> --- a/man8/zic.8
> +++ b/man8/zic.8
> @@ -293,7 +293,7 @@ nor
>  .q + .
>  To allow for future extensions,
>  an unquoted name should not contain characters from the set
> -.q !$%&'()*,/:;<=>?@[\e]^`{|}\(ti .
> +.q !$%&'()*,/:;<=>?@[\e]^\`{|}\(ti .

You didn't proffer any complaints about the foregoing, so I assume it
was just for context (to include the whole commit, maybe).

Nevertheless I think it can be further improved.  That neutral
apostrophe and caret/circumflex should be changed as well, to ensure
that they don't render as a directional closing (right) single quote,
’ U+2019 and modifier letter circumflex ˆ U+02C6.  This advice is also
in groff 1.22.4's groff_man(7) page.

+.q !$%&\(aq()*,/:;<=>?@[\e]\(ha\`{|}\(ti .
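
The glob.7 failure described earlier is mechanical enough to check for
before groff ever sees the page.  Here is a minimal sketch in Python,
not anything groff or the man-pages lint target actually runs: the
function name find_bad_paren_escapes is made up for illustration, and
the scanner assumes "\" is the escape character and makes no attempt to
parse the arguments of other escape sequences.  It only reports "\(xx"
sequences whose two-character name contains the escape character.

```python
def find_bad_paren_escapes(line):
    """Return 0-based offsets of two-character special character escapes
    whose name contains the escape character."""
    hits = []
    i = 0
    while i < len(line):
        if line[i] != "\\":
            i += 1
            continue
        if line[i + 1 : i + 2] == "(":
            name = line[i + 2 : i + 4]  # the "xx" of \(xx
            if "\\" in name:
                hits.append(i)
                i += 2  # resume scanning inside the malformed name
            else:
                i += 4
        else:
            # Any other escape: step over the escape character and the
            # character after it.  This keeps an escaped backslash from
            # being misread as the start of a new escape, but it does
            # not parse escape arguments.
            i += 2
    return hits


# The two candidate lines from the glob.7 hunk.
broken = r'''to "\fI[a\('a\(\`a\(:a\(^a]\fP", that is,'''
fixed = r'''to "\fI[a\('a\(`a\(:a\(^a]\fP", that is,'''
print(find_bad_paren_escapes(broken))
print(find_bad_paren_escapes(fixed))
```

Run on those two lines, it flags one offset in the line from the commit
and none in the line with the escape character removed.  A check this
crude is no substitute for running groff with warnings enabled; it only
illustrates the rule that the name after "\(" must not itself begin an
escape sequence.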
Moreover, as partly noted in our discussion about double quotes in macro
arguments, there were no special characters for the double quote or
neutral apostrophe in Unix troff.  Since we're not getting 50 years of
backward compatibility anyway, for the Linux man-pages project I
recommend going ahead and using groff-style escape sequences for these.

+.q !$%&\[aq]()*,/:;<=>?@[\[rs]]\[ha]\`{|}\[ti] .

Are you willing to settle for 30 years of backward compatibility?  ;-)

In my opinion it is more helpful in dense contexts like this to have the
paired delimiters [ ] to demarcate the glyph identifier than to achieve
portability to systems that don't support identifiers you need anyway.

(I note that `q` is a page-local macro and therefore bad style for
portability reasons[1].  That said, I have been _sorely_ tempted to add
a `Q` macro for this precise purpose to groff man(7).  I have hopes that
it would give people something to reach for besides bold and italics for
every damn thing.)

Most--I hope all--of the above is discussed comprehensively in the
current version of groff_char(7)[2], which I have rewritten completely
since groff 1.22.4 and substantially modified even since the last Linux
man-pages snapshot at
<https://man7.org/linux/man-pages/man7/groff_char.7.html>.  I now know
the answers to many questions of the form "why the **** is {groff,troff}
this way?", and have endeavored to share them.  The "History" section is
completely new.

Regards,
Branden

[1] groff's own man pages are not without sin in this regard.  I have
    cleaned them up a lot since 1.22.4, but a few adventurous stragglers
    remain that define and use page-local macros pervasively.  All are
    on the long side.
[2] https://git.savannah.gnu.org/cgit/groff.git/tree/man/groff_char.7.man

    I recommend that for source perusal only; do not try to render it
    with man-db man(1) or groff 1.22.4, because groff 1.23.0 will be
    adding a new macro, `MR`, for man page cross references[3] and its
    own pages have already been ported to use it.  (This is where I
    flagellate myself for not having a groff 1.23.0-rc2 out yet.  :( )

[3] https://git.savannah.gnu.org/cgit/groff.git/tree/NEWS#n165