Please do not reply directly to this email. All additional comments should be
made in the comments box of this bug report.

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=172792

           Summary: use of study() with utf8 support enabled breaks regexps
           Product: Fedora Core
           Version: devel
          Platform: All
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: normal
         Component: perl
        AssignedTo: jvdias@xxxxxxxxxx
        ReportedBy: jvdias@xxxxxxxxxx
         QAContact: dkl@xxxxxxxxxx
                CC: fedora-perl-devel-list@xxxxxxxxxx


Description of problem:

Use of study() with utf8 support enabled breaks perl-5.8.7's regular
expressions:

OK without UTF:
$ echo 'ABDCEFGHIJK' | perl -pe 'study; s/HIJK/1234/;'
ABDCEFG1234
$ echo 'ABCDEFGHIJK' | perl -e '$_=<>; study; print /HIJK/,"\n";'
1

FAILS with UTF:
$ echo 'ABDCEFGHIJK' | PERL_UNICODE=31 perl -pe 'study; s/HIJK/1234/;'
ABDCEFGHIJK
$ echo 'ABCDEFGHIJK' | PERL_UNICODE=31 perl -e '$_=<>; study; print /HIJK/,"\n";'

(re did not match)

Seems to be study() that is the culprit:
$ echo 'ABDCEFGHIJK' | PERL_UNICODE=31 perl -pe 's/HIJK/1234/;'
ABDCEFG1234

And it is because $_ gets utf8-ness from STDIN:
$ echo 'ABDCEFGHIJK' | PERL_UNICODE=63 perl -e '$_=<>; study; print /HIJK/ ? "OK" : "FAIL","\n";'
FAIL
$ PERL_UNICODE=63 perl -e '$_="ABDCEFGHIJK"; study; print /HIJK/ ? "OK" : "FAIL","\n";'
OK

This was in the 'en_US.UTF-8' locale. If I make utf-8 support conditional on
locale, the problem goes away for the C locale:
$ echo 'ABDCEFGHIJK' | PERL_UNICODE=127 LC_ALL=C perl -e '$_=<>; study; print /HIJK/ ? "OK" : "FAIL","\n";'
OK

Version-Release number of selected component (if applicable):
ALL perl versions

How reproducible:
100%

Additional Information:
This is upstream perl bug 37646
( http://rt.perl.org/rt3/index.html?q=37646 )