Re: [PATCH 2/2] DocBook: Use a fixed encoding for output

On Fri, 2015-09-11 at 13:30 -0600, Jonathan Corbet wrote:
> On Tue, 01 Sep 2015 23:49:19 +0100
> Ben Hutchings <ben@xxxxxxxxxxxxxxx> wrote:
> 
> > Currently the encoding of documents generated by DocBook depends on
> > the current locale.  Make the output reproducible independently of
> > the locale, by setting the encoding to UTF-8 (LC_CTYPE=C.UTF-8) by
> > preference, or ASCII (LC_CTYPE=C) as a fallback.
> 
> I guess I have to ask, though: doesn't it seem that having the docs
> produced according to the current locale is the Right Thing to do?  Users
> have their locale set as it is for a reason, it seems like the production
> of textual documents should respect their choice.
> 
> Am I missing something here?

Yes - the locale's character encoding applies to plain text, but rich
text formats can have a locale-independent encoding which the viewer
will automatically convert to the current locale's encoding.

For HTML, the document encoding can be explicit in the document header
(and is, in this case).
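
As a rough illustration of why the header makes the locale irrelevant,
here is a minimal Python sketch, not part of the patch. The helper name
declared_charset and the sample document are made up for this example;
a real viewer follows the full HTML encoding-detection rules rather
than a single regular expression.

```python
import re

def declared_charset(html_bytes, default="utf-8"):
    """Return the charset named in the document header, if any.

    The declaration itself is plain ASCII, so it can be located in the
    raw bytes before the document has been decoded.
    """
    match = re.search(rb'charset=["\']?([A-Za-z0-9_\-]+)', html_bytes[:1024])
    return match.group(1).decode("ascii") if match else default

# Hypothetical generated page that states its own encoding.
doc = (
    '<html><head><meta http-equiv="Content-Type" '
    'content="text/html; charset=UTF-8"></head>'
    '<body>na\u00efve</body></html>'
).encode("utf-8")

# Decoding uses the declared charset, never the reader's LC_CTYPE.
text = doc.decode(declared_charset(doc))
```

The decoded text is the same whatever locale the reader runs under;
only the final rendering step depends on the reader's environment.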

Manual pages were already consistently encoded in UTF-8, as this is the
default behaviour of DocBook-XSL (and is what man-db prefers as input).

PDF and PostScript documents have arbitrary and explicit mappings from
character numbers (or names) to glyphs, and PDF documents normally have
a mapping from glyphs back to Unicode code points to support searching
and copying text.

Ben.



