Re: file-not-utf8 complaints

Toshio Kuratomi wrote:
Jason L Tibbitts III wrote:
Normally we fix up non-utf8 documentation and such with a quick call
to iconv.  It seems that this is problematic for some; see
https://bugzilla.redhat.com/show_bug.cgi?id=226079

Any comments on how much we actually care about this, especially in
the case that it might not actually be as easy as a call to iconv
(such as a changelog file with a pile of random encodings in it).

Well... The reason that all files must be UTF-8 is exactly the problem that the ChangeLog exhibits, so I don't have a lot of sympathy there.

+1,

I fully agree with Daniel that blindly converting text-ish files which specify an encoding in their headers is both wrong and dangerous, as that actually breaks stuff. Normal text files, though, especially ones in %doc, should be in UTF-8 so that they display correctly when opened.

Indeed, the changelog is a perfect example of why all plain text files must be UTF-8: had it always been UTF-8, the problem of parts being in a West European encoding and parts in an East European encoding would not exist.

Also, I think it's worth noting that Fedora is not the only distro doing this; Debian, for example, also tries to have all text files in the distro in UTF-8.

I'll also put a comment to this effect in the review.
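
To illustrate why a file with mixed, unrecorded encodings cannot be fixed with a single conversion, here is a small Python sketch. ISO-8859-1 and ISO-8859-2 are assumed stand-ins for the west- and east-european encodings; the thread does not say which ones the ChangeLog really contains, and the helper name is made up for this example.

```python
def decode_both(raw: bytes) -> tuple[str, str]:
    """Decode the same bytes as ISO-8859-1 and as ISO-8859-2 (assumed encodings)."""
    return raw.decode("iso-8859-1"), raw.decode("iso-8859-2")


# Byte 0xF5 is U+00F5 in ISO-8859-1 but U+0151 in ISO-8859-2, so without a
# recorded encoding there is no way to tell which character the author meant.
west, east = decode_both(b"\xf5")
ambiguous = west != east
```

Both decodes succeed without error, which is why the mismatch goes unnoticed until someone reads the names.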

Regards,

Hans



The names and special characters in that file are already corrupted, since there's no common encoding and none is recorded with the names. Dropping it from the package, as Daniel suggested, is certainly an option, as there's no requirement that ChangeLogs need to be in a package and it is not something that must be changed.

Reencoding the xml files that specify an encoding isn't strictly necessary. We should probably ask upstream whether they are amenable to changing to utf-8. Since libxml2 deals with utf-8 internally and the upstream author made a nice writeup about why he made that choice, upstream might be amenable to that. If upstream is not amenable, we should consider changing the Packaging Guidelines to reflect that xml files which specify their encoding do not have to be re-encoded to utf-8. (Although we then have to ask ourselves if we should be checking that the xml files actually use the encoding that they specify :-( )
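
A rough sketch of what such a check might look like, in Python. This is not an existing Fedora or rpmlint check; the function names are hypothetical. It only handles ASCII-compatible declarations (no UTF-16), and treats a missing declaration as UTF-8. Note also that single-byte encodings such as ISO-8859-1 accept any byte sequence, so in practice this mostly catches files that claim UTF-8 but aren't.

```python
import codecs
import re

# Matches the encoding pseudo-attribute of an XML declaration at the start of the file.
_DECL = re.compile(rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']')


def declared_encoding(data: bytes) -> str:
    """Return the encoding named in the XML declaration, or "utf-8" if there is none."""
    match = _DECL.match(data)
    return match.group(1).decode("ascii") if match else "utf-8"


def matches_declared_encoding(data: bytes) -> bool:
    """True if the bytes decode cleanly using the encoding the file declares."""
    encoding = declared_encoding(data)
    try:
        codecs.lookup(encoding)  # unknown encoding names raise LookupError
        data.decode(encoding)
    except (LookupError, UnicodeDecodeError):
        return False
    return True
```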

NEWS and other files that are neither specifying an encoding nor mixed up in such a way that they are hopelessly corrupted WRT the original characters should definitely be converted to utf-8. If Daniel wants to hold open the Merge Review until that has gone in upstream, that is his prerogative.
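
For files like these, where a single source encoding is known, the conversion is the simple case that a call to iconv handles. A minimal Python sketch of the same idea follows; the function name is made up, and the source encoding is something the packager has to know, since it cannot be detected reliably. Files that are already valid UTF-8 are left untouched so that running it twice does no harm.

```python
def to_utf8(data: bytes, source_encoding: str) -> bytes:
    """Re-encode data to UTF-8, assuming it is entirely in source_encoding."""
    try:
        data.decode("utf-8")
        return data  # already valid UTF-8, nothing to do
    except UnicodeDecodeError:
        pass
    # Raises UnicodeDecodeError if the bytes are not valid in source_encoding.
    return data.decode(source_encoding).encode("utf-8")
```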

The most chilling aspect of that review is that the maintainer does not seem to think that it's his responsibility to take issues with the upstream source to upstream. Since Daniel is upstream, I'm not certain I can see why he feels that someone else should be reporting it upstream before he deals with it.

-Toshio


------------------------------------------------------------------------

--
Fedora-packaging mailing list
Fedora-packaging@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/fedora-packaging
