Hey Michael & Branden!

On 1/22/21 4:56 AM, G. Branden Robinson wrote:
> Hi Michael!
>
> At 2021-01-21T12:03:13+0100, Michael Kerrisk (man-pages) wrote:
>> I appreciate your long answer *very* much. But, I'm glad you started
>> with the short answer :-).
>
> Cool! But beware, from such pressures is the practice of top-replying
> born... ;-)
>
>>> Another issue to consider is that as PDF rendering technology has
>>> improved on Linux, it has become possible to copy and paste from PDF
>>> documents into a terminal window. In my opinion we should make this
>>> work as well as we can. Expert Linux users may not ever do this,
>>> wondering why anyone would ever try; new Linux users will quite
>>> reasonably expect to be able to do it.
> [...]
>>> And I mean copy-and-paste not just from PDF but from a terminal
>>> window.
>>
>> Yes, but I have a question: "\-1" renders in PDF as a long dash
>> followed by a "1". This looks okay in PDF, but if I copy and paste
>> into a terminal, I don't get an ASCII 45. This seems to contradict
>> what you are saying about cut-and-paste above. What am I missing?
>
> The gap between aspiration and implementation. I don't think the
> "copy-and-paste from PDF to terminal window" matter is completely sorted
> out yet.
>
> I'm a strident prescriptionist about preserving the distinction between
> "-" and "\-" in roff documents, notably including man pages in part
> because it affords us more room to design around this problem.
>
> ASCII and ISO 8859 unified the hyphen and minus characters. AT&T troff
> and all of its descendants distinguished them. Unicode also
> distinguishes them. But Unix has a habit of calling ASCII 055 (45
> decimal) a "dash", and moreover, to much software, only the numerical
> value of the code point is important.
>
> It's quite possible that for man(7) documents rendering to PDF, we
> should perform the following mapping (in the man macros).
>
>   .if '\*[.T]'pdf' \
>   .  char \- \N'45'
>
> This didn't come up in my argument with (mostly?) BSD people because (1)
> the immediate issue that raised concern had to do with the grave accent
> and apostrophe instead and (2) everybody in that camp who spoke up on
> the matter said they seldom, if ever, render man pages to PostScript or
> PDF. By that token, the above 2-liner may not be a controversial matter
> to the people I was arguing with. :)
>
> Consider what would happen to the appearance of PDF-rendered man pages
> if we encouraged all \- escaped hyphens to be rewritten as plain hyphens
> in the source first, and did the following to mandate uniformity.
>
>   .if '\*[.T]'pdf' \{\
>   .  char \- \N'45'
>   .  char - \N'45'
>   .\}
>
> ...just as is currently done for the 'utf8' output driver, whose second
> line I want to kill off.
>
> I feel that responsible stewardship of the groff man macro
> implementation means considering the needs of diverse audiences.
>
>> I don't really have any other questions, but I have tried to distill
>> the above into some text in man-pages(7) to remind myself for the
>> future:
>>
>> [[
>> .PP
>> The use of real minus signs serves the following purposes:
>> .IP * 3
>> To provide better renderings on various targets other than
>> ASCII terminals,
>> notably in PDF and on Unicode/UTF\-8-capable terminals.
>> .IP *
>> To generate glyphs that when copied from rendered pages will
>> produce real minus signs when pasted into a terminal.
>> ]]
>>
>> Seem okay?
>
> What a "real minus sign" is is a fraught issue[1], but if for the
> purposes of man-pages(7) it means the ASCII/ISO hyphen-minus, then yes,
> I think it's good enough.
>
> Regards,
> Branden
>
> [1] especially in light of the \[mi] special character escape and the
> existence of U+2212 :-/
>

I just found another good reason to use '\-'.

I was searching for a curl option in its man page, and I used '/ -s',
as I usually do when I search for options. To my surprise, it didn't
find anything; in fact, '/-' showed only two occurrences of the minus
sign. However, if I copy the dash character from one of the options
and paste it into the pager's search prompt, it does find the options.

I have already reported the bug to the curl developers.

I checked that in our pages we can search for options (see time.1).
I wonder whether there are cases where we produce some odd character
that can't easily be searched for.

Regards,

Alex

-- 
Alejandro Colomar
Linux man-pages comaintainer; https://www.kernel.org/doc/man-pages/
http://www.alejandro-colomar.es/
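
P.S. One rough way to look for such hard-to-search characters is to scan the rendered text of a page for code points that look like "-" but are not ASCII 45. The sketch below is only an illustration: the function name find_lookalike_dashes and the particular set of code points are my own assumptions, not something taken from groff or from the curl report, and the set may well be incomplete.

```python
import unicodedata

# Assumed set of dash-like code points that are NOT ASCII 45 (hyphen-minus).
# This list is illustrative, not exhaustive.
LOOKALIKE_DASHES = {
    "\u2010",  # HYPHEN
    "\u2011",  # NON-BREAKING HYPHEN
    "\u2013",  # EN DASH
    "\u2212",  # MINUS SIGN
}


def find_lookalike_dashes(text):
    """Return (line, column, Unicode name) for every dash look-alike.

    Line and column numbers are 1-based. ASCII hyphen-minus is ignored,
    since that is the character a pager search for '-' will match.
    """
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in LOOKALIKE_DASHES:
                hits.append((lineno, col, unicodedata.name(ch)))
    return hits


if __name__ == "__main__":
    # Hypothetical rendered fragment: the first line uses U+2010,
    # the second uses a plain ASCII hyphen-minus.
    sample = "  \u2010s, \u2010\u2010silent\n  -v\n"
    for lineno, col, name in find_lookalike_dashes(sample):
        print(f"line {lineno}, col {col}: {name}")
```

Feeding it the text of a rendered page, however you capture it, would list every option whose leading dash a plain '/-' search in the pager cannot match.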