[Bug 729972] New: Review Request: perl-Test-Mojibake - Check your source for encoding misbehavior

bugzilla@xxxxxxxxxx · Thu, 11 Aug 2011 08:37:48 -0400

Please do not reply directly to this email. All additional
comments should be made in the comments box of this bug.

Summary: Review Request: perl-Test-Mojibake - Check your source for encoding misbehavior

https://bugzilla.redhat.com/show_bug.cgi?id=729972

           Summary: Review Request: perl-Test-Mojibake - Check your source
                    for encoding misbehavior
           Product: Fedora
           Version: rawhide
          Platform: All
        OS/Version: Linux
            Status: NEW
          Severity: medium
          Priority: medium
         Component: Package Review
        AssignedTo: nobody@xxxxxxxxxxxxxxxxx
        ReportedBy: paul@xxxxxxxxxxxx
         QAContact: extras-qa@xxxxxxxxxxxxxxxxx
                CC: notting@xxxxxxxxxx,
                    package-review@xxxxxxxxxxxxxxxxxxxxxxx
    Classification: Fedora
      Story Points: ---
              Type: ---

Spec URL:
http://subversion.city-fan.org/repos/cfo-repo/perl-Test-Mojibake/branches/fedora/perl-Test-Mojibake.spec
SRPM URL:
http://www.city-fan.org/~paul/extras/perl-Test-Mojibake/perl-Test-Mojibake-0.3-2.fc17.src.rpm

Description:

Many modern text editors save files in the UTF-8 encoding by default.
However, the perl interpreter does not assume UTF-8 unless told to. Whilst this
is not a big deal for (most) backend-oriented programs, applications based on
Web frameworks (Catalyst, Mojolicious) will suffer from so-called Mojibake
(literally: "unintelligible sequence of characters"). Even worse: if an editor
saves a BOM (Byte Order Mark, the U+FEFF character in Unicode) at the start of
a script with the executable bit set (on Unix systems), the script won't
execute at all, because the BOM precedes the "#!" and corrupts the shebang.

Avoiding encoding problems is quite simple:

 * Always "use utf8" or "use common::sense" when saving source as UTF-8
 * Always specify "=encoding utf8" when saving POD as UTF-8
 * Do neither of the above when saving as ISO-8859-1
 * Never save a BOM (not that it's wrong; just avoid it, as you'll barely
   notice its presence when in trouble)
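
The rules above can be illustrated with a rough sketch. It is written in
Python purely for illustration and is not how Test::Mojibake itself is
implemented; the function name check_source, the two regular expressions and
the problem messages are all hypothetical. It treats the file as a whole and
does not distinguish code from POD, so it is only a coarse approximation of
the checks described.

```python
import codecs
import re

# Hypothetical patterns for the declarations named in the list above.
UTF8_PRAGMA = re.compile(r"^\s*use\s+(?:utf8|common::sense)\b", re.MULTILINE)
POD_ENCODING = re.compile(r"^=encoding\s+utf-?8\b", re.MULTILINE | re.IGNORECASE)


def check_source(data: bytes) -> list:
    """Return a list of encoding problems found in the raw bytes of a file."""
    problems = []

    # Rule 4: a BOM in front of "#!" breaks the shebang line.
    if data.startswith(codecs.BOM_UTF8):
        problems.append("BOM at start of file")
        data = data[len(codecs.BOM_UTF8):]

    try:
        text = data.decode("utf-8")
        is_utf8 = True
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 is read as ISO-8859-1.
        # Decoding as latin-1 never fails, since every byte maps to a character.
        text = data.decode("latin-1")
        is_utf8 = False

    declared = bool(UTF8_PRAGMA.search(text) or POD_ENCODING.search(text))
    non_ascii = any(byte > 0x7F for byte in data)

    # Rules 1 and 2: UTF-8 content needs a matching declaration.
    if non_ascii and is_utf8 and not declared:
        problems.append("UTF-8 content without 'use utf8' or '=encoding utf8'")

    # Rule 3: content that is not UTF-8 must not claim to be.
    if non_ascii and not is_utf8 and declared:
        problems.append("UTF-8 declared but content is not valid UTF-8")

    return problems
```

A pure-ASCII file passes regardless of its declarations, since ASCII bytes
mean the same thing in UTF-8 and in ISO-8859-1.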

However, if you find yourself upgrading old code to use UTF-8 or trying to
standardize a big project with many developers, each one using a different
platform/editor, reviewing all files manually can be quite painful, especially
in cases where some files have multiple encodings (note: it all started when I
realized that gedit and derivatives are unable to open files with character
conversion tables).

Enter Test::Mojibake ;)

-- 
Configure bugmail: https://bugzilla.redhat.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
_______________________________________________
package-review mailing list
package-review@xxxxxxxxxxxxxxxxxxxxxxx
https://admin.fedoraproject.org/mailman/listinfo/package-review