Re: file-not-utf8 complaints

Patrice Dumas wrote:
On Sun, Jun 01, 2008 at 10:17:32AM -0700, Toshio Kuratomi wrote:
Patrice Dumas wrote:
On Sat, May 31, 2008 at 04:09:25PM -0700, Toshio Kuratomi wrote:
However, the flip side of this is that if a program has an XML config file that the user is expected to edit manually in a text editor, and the program will adapt to multiple encodings (for instance, by using libxml2 to parse the file[1]_), having it exist in UTF-8 is much better than having it exist in SOME_EXOTIC_ENCODING. In this case it's the program
I disagree. It is not an obvious choice and should be left to the
maintainer. It depends on the user target of the software, for instance.

Please state your counterexample. I'm laying out the parameters by which we could relax the current rule. If we don't lay out the boundaries correctly, the replacement rule will still end up being too restrictive.

I may be wrong, but it seems to me that there is no current rule, except
that rpmlint warnings/errors should be handled if possible. There is
nothing about that in the guidelines (spec files and filenames should be
UTF-8, though).

My bad, I must have been recalling the debates over the "filenames must be UTF-8" guideline. If there's no current guideline then I'm not sure we need a new one.

Here is a wording that would seem right to me:

Files that don't carry information about their encoding should be
converted to UTF-8. This is typically useful for NEWS files containing
author names with accented characters. There may be exceptions: for
example, a README.cn file written in Chinese may be encoded in a popular
Chinese encoding like Big5.

I could go either way on this, but I lean towards saying this should be UTF-8. Shift JIS, Big5, etc. have benefits over UTF-8, and the people who use those encodings are the consumers of this file. OTOH, for Fedora to truly support the UTF-8 locale out of the box, these kinds of files (which don't specify an encoding and aren't used by the program) have to be UTF-8. How can we ship with a UTF-8 locale by default knowing that the README.cn isn't readable by people who stick with our default?
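
For what it's worth, converting a file that doesn't announce its encoding is mechanical once the packager knows what the source encoding is. Here is a minimal Python sketch of that step. The helper name to_utf8 is hypothetical, and the source encoding is assumed to be supplied by hand, since nothing in such a file states it. The "already UTF-8?" check is only a heuristic: pure-ASCII text passes it trivially.

```python
def to_utf8(data: bytes, source_encoding: str) -> bytes:
    """Return data re-encoded as UTF-8.

    Hypothetical helper for illustration. source_encoding is an
    assumption the caller must supply; it cannot be read from the file.
    """
    try:
        # If the bytes already decode cleanly as UTF-8, leave them alone.
        data.decode("utf-8")
        return data
    except UnicodeDecodeError:
        pass
    # Otherwise decode with the encoding the packager believes is in use
    # and re-encode as UTF-8. A wrong guess may raise or produce mojibake.
    return data.decode(source_encoding).encode("utf-8")


# A NEWS-style line with an accented author name, stored as ISO-8859-1.
news_latin1 = "Thanks to José".encode("iso-8859-1")
# A README.cn-style snippet stored as Big5.
readme_big5 = "中文".encode("big5")

print(to_utf8(news_latin1, "iso-8859-1").decode("utf-8"))
print(to_utf8(readme_big5, "big5").decode("utf-8"))
```

The helper takes the encoding as a parameter rather than guessing it because legacy encodings such as ISO-8859-1 will decode almost any byte sequence without error, so a wrong guess tends to go unnoticed.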

Files that carry their encoding information (XML, TeX, Info...) may also be converted to UTF-8, but the decision is left to the package maintainer. Conversion may be especially relevant for files that are to be edited by the user, since it may be difficult to edit a file not in UTF-8, while UTF-8 should be handled by most editors automatically, as the default for Fedora is a UTF-8 locale.

This part seems quite reasonable as a recommendation.
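
To illustrate why the self-describing case is different: an XML parser reads the encoding from the declaration, so the file is handled correctly whatever the user's locale, and converting it to UTF-8 is a matter of re-serializing. The sketch below uses Python's standard xml.etree.ElementTree in place of libxml2, purely for illustration. The function name reencode_xml_to_utf8 and the sample config are made up, and passing xml_declaration to tostring assumes Python 3.8 or later.

```python
import xml.etree.ElementTree as ET


def reencode_xml_to_utf8(data: bytes) -> bytes:
    """Parse XML using its declared encoding and re-serialize as UTF-8.

    Hypothetical helper for illustration only. Comments and formatting
    outside the root element are not preserved by this round trip.
    """
    # The parser honours the encoding named in the XML declaration.
    root = ET.fromstring(data)
    # Write the tree back out with a UTF-8 declaration (Python 3.8+).
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


# A made-up config file that declares and uses ISO-8859-1.
source = (
    '<?xml version="1.0" encoding="ISO-8859-1"?>'
    "<config><name>José</name></config>"
).encode("iso-8859-1")

converted = reencode_xml_to_utf8(source)
print(converted.decode("utf-8"))
```

Since the program reads either version identically, the only thing the conversion changes is how comfortably a user can open the file in an editor under a UTF-8 locale.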

-Toshio

