[Bug 172792] New: use of study() with utf8 support enabled breaks regexps


 



Please do not reply directly to this email. All additional
comments should be made in the comments box of this bug report.




https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=172792

           Summary: use of study() with utf8 support enabled breaks regexps
           Product: Fedora Core
           Version: devel
          Platform: All
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: normal
         Component: perl
        AssignedTo: jvdias@xxxxxxxxxx
        ReportedBy: jvdias@xxxxxxxxxx
         QAContact: dkl@xxxxxxxxxx
                CC: fedora-perl-devel-list@xxxxxxxxxx


Description of problem:

Using study() with UTF-8 support enabled breaks regular expression
matching in perl-5.8.7:

OK without UTF-8:
$  echo 'ABDCEFGHIJK' | 
   perl -pe 'study; s/HIJK/1234/;'
ABDCEFG1234

$ echo 'ABCDEFGHIJK' |
  perl -e '$_=<>; study; print /HIJK/,"\n";'
1

FAILS with UTF-8:
$ echo 'ABDCEFGHIJK' |
  PERL_UNICODE=31 perl -pe 'study; s/HIJK/1234/;'
ABDCEFGHIJK

$ echo 'ABCDEFGHIJK' | 
  PERL_UNICODE=31 perl -e '$_=<>; study; print /HIJK/,"\n";'

(the regexp did not match)

study() seems to be the culprit; without it the substitution works:
$ echo 'ABDCEFGHIJK' | 
  PERL_UNICODE=31 perl -pe 's/HIJK/1234/;'
ABDCEFG1234

The cause is that $_ acquires utf8-ness when it is read from STDIN;
the same string assigned as a literal matches:

$ echo 'ABDCEFGHIJK' |
  PERL_UNICODE=63 perl -e '$_=<>; study; print /HIJK/ ? "OK" : "FAIL","\n";'
FAIL

$ PERL_UNICODE=63 perl -e '$_="ABDCEFGHIJK"; study;
    print /HIJK/ ? "OK" : "FAIL","\n";'
OK

This was in the 'en_US.UTF-8' locale. If I make UTF-8 support
conditional on the locale (PERL_UNICODE=127 adds the L flag), the
problem goes away in the C locale:

$ echo 'ABDCEFGHIJK' |
  PERL_UNICODE=127 LC_ALL=C perl -e '$_=<>; study;
    print /HIJK/ ? "OK" : "FAIL","\n";'
OK

Version-Release number of selected component (if applicable):
ALL perl versions

How reproducible:
100%

Additional Information:

This is upstream perl bug 37646 ( http://rt.perl.org/rt3/index.html?q=37646 )

-- 
Configure bugmail: https://bugzilla.redhat.com/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.

