On Fri, Apr 21, 2017 at 11:35 PM, Jeff King <peff@xxxxxxxx> wrote:
> On Fri, Apr 21, 2017 at 11:28:42PM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> > I thought there was some "use" flag we could set to just make all of our
>> > handles utf8. But all I could come up with was stuff like PERLIO and
>> > "perl -C". Using binmode isn't too bad, though (I think you could
>> > just do it as part of the open, too, but I'm not sure if antique
>> > versions of perl support that).
>>
>> [Debugging perl encoding issues is one of the many perks of my dayjob]
>>
>> Using binmode like this is about as straightforward as you can get,
>> the former occurrence could be equivalently replaced by:
>>
>> utf8::decode(my $line = <$fh>);
>>
>> But better just to mark the handle as utf8. There's a fancier way to
>> do it as part of the three-arg-open syntax, but I couldn't remember
>> whether all the perl versions we support have it.
>
> Yeah, I agree marking the handle is better. binmode is pretty
> straightforward, but we'd have to remember to manually set it if we add
> any other handles. That's probably not a big deal in this particular
> script, though, which is pretty short.
>
>> About the "use" flag, you're probably thinking of the confusingly
>> named "use utf8", but that's to set your source code to utf8, not your
>> handles, e.g.:
>>
>> $ perl -CA -MDevel::Peek -wE 'use utf8; my $日本語 = shift; Dump $日本語' æ
>> SV = PV(0x12cc090) at 0x12cded8
>> REFCNT = 1
>> FLAGS = (PADMY,POK,pPOK,UTF8)
>> PV = 0x12de460 "\303\246"\0 [UTF8 "\x{e6}"]
>> CUR = 2
>> LEN = 16
>>
>> As you can see people got a bit overexcited about Unicode in the 90s.
>
> Yeah, I know "use utf8" doesn't work for that, but I was thinking there
> was some other trick. Digging...ah, here it is:
>
> use open ':encoding(utf8)'
>
> No clue how portable that is. For such a small script it may be better
> to just stick with vanilla binmode().

Yeah, that would work, but not on 5.8.0, which is the lowest version we support.
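
For reference, here is a minimal sketch of the two approaches discussed above: marking the handle with binmode() versus decoding each line by hand with utf8::decode(). The scratch file and the variable names ($marked, $manual) are made up for illustration and are not taken from the script under discussion.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write the two raw bytes of UTF-8 "æ" (0xC3 0xA6) to a scratch file.
# The file is purely illustrative; it stands in for whatever the script reads.
my ($out, $path) = tempfile(UNLINK => 1);
binmode($out);
print {$out} "\xc3\xa6\n";
close($out) or die "close: $!";

# Approach 1: mark the handle as utf8 so every read comes back decoded.
open(my $fh, '<', $path) or die "open $path: $!";
binmode($fh, ':utf8');
my $marked = <$fh>;
close($fh);
chomp $marked;

# Approach 2: read raw bytes and decode the line by hand.
open($fh, '<', $path) or die "open $path: $!";
my $manual = <$fh>;
close($fh);
chomp $manual;
utf8::decode($manual);

# Both strings should now be a single character (U+00E6), not two bytes.
printf "marked: %d, manual: %d\n", length($marked), length($manual);
```

The binmode() variant only has to be applied once per handle, whereas the utf8::decode() variant has to be repeated at every read, which is the trade-off mentioned in the quoted text.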