Editor's note: Perl guru Tom Christiansen created and maintains a list of 44 recipes for working with Unicode in Perl 5. Perl.com is pleased to serialize this list over the coming weeks.
℞ 17: Make file I/O default to utf8
If you've ever had the misfortune of seeing the Unicode warning "Wide character in print", you may have realized that some part of your program forgot to set an appropriate Unicode-capable encoding on a filehandle. Remember that the rule of Unicode handling in Perl is "always encode and decode at the edges of your program".
You can easily decode STDIN, STDOUT, and STDERR as UTF-8 by
default, or decode STDIN, STDOUT, and STDERR per locale
settings by default, or you can use binmode
to set the encoding on a specific filehandle.
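As a minimal sketch of the binmode approach, the following writes one non-ASCII character to a single filehandle and then reads the file back as raw bytes to show what was stored. The scratch file from File::Temp and the choice of U+2603 are assumptions made for illustration only.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file for the demonstration; removed when the program exits.
my ($fh, $path) = tempfile(UNLINK => 1);

# Set the encoding on this one filehandle only.
binmode($fh, ':encoding(UTF-8)') or die "binmode failed: $!";

# U+2603 SNOWMAN; without the layer above, this print would warn
# "Wide character in print".
print {$fh} "\x{2603}\n";
close($fh) or die "close failed: $!";

# Read the file back with no decoding to see the bytes on disk.
open(my $raw, '<:raw', $path) or die "Cannot read $path: $!";
my $bytes = do { local $/; <$raw> };
close($raw);

printf "%d bytes written\n", length($bytes);
```

Only the filehandle passed to binmode is affected; every other filehandle in the program keeps whatever layers it already had.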
Alternatively, you can set the default encoding on all filehandles throughout the
entire program, or on a lexical basis. As documented in perldoc perlrun, the
-C flag and the PERL_UNICODE environment variable are
available. Use the D option to make all filehandles default to
UTF-8 encoding. That is, files opened without an encoding argument will be in
UTF-8:
$ perl -CD ...
# or
$ export PERL_UNICODE=D
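To check from inside a program whether either setting took effect, one option is to inspect the read-only ${^UNICODE} variable, which reflects the -C switches. The sketch below assumes the bit values listed in perlrun (8 for i, 16 for o, with D as their sum); the helper name default_utf8_streams is hypothetical.

```perl
use strict;
use warnings;

# Bit values for the -C letters i and o, as listed in perlrun; D is i + o.
use constant {
    UNICODE_IN_DEFAULT  => 8,     # i: UTF-8 is the default layer for input
    UNICODE_OUT_DEFAULT => 16,    # o: UTF-8 is the default layer for output
};

# Hypothetical helper: name the stream directions that default to UTF-8
# for a given ${^UNICODE} value.
sub default_utf8_streams {
    my ($flags) = @_;
    my @streams;
    push @streams, 'input'  if $flags & UNICODE_IN_DEFAULT;
    push @streams, 'output' if $flags & UNICODE_OUT_DEFAULT;
    return @streams;
}

my @active = default_utf8_streams(${^UNICODE});
print @active
    ? "UTF-8 is the default for: @active\n"
    : "No default UTF-8 layer requested via -C or PERL_UNICODE\n";
```

Run with perl -CD, this should report both directions; run with no switch and no PERL_UNICODE in the environment, it should report neither.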
The open pragma configures the default encoding of all filehandle operations in its lexical scope:
use open qw(:utf8);
Note that the open pragma is currently incompatible with the autodie
pragma.
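The lexical behavior of the open pragma can be sketched as follows. Inside the block, open calls that name no layer pick up :utf8. The final read asks for :raw explicitly to show the bytes on disk. The scratch file and variable names are assumptions for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file for the demonstration; removed when the program exits.
my ($tmp, $path) = tempfile(UNLINK => 1);
close($tmp) or die "close failed: $!";

my $chars;
{
    # Inside this block, open() calls that name no layer default to :utf8.
    use open qw(:utf8);

    open(my $out, '>', $path) or die "Cannot write $path: $!";
    print {$out} "\x{2603}\n";    # U+2603 SNOWMAN
    close($out) or die "close failed: $!";

    open(my $in, '<', $path) or die "Cannot read $path: $!";
    $chars = <$in>;               # decoded: one character plus a newline
    close($in);
}

# An explicit layer overrides any default; :raw reads the stored bytes.
open(my $raw, '<:raw', $path) or die "Cannot read $path: $!";
my $bytes = <$raw>;
close($raw);

printf "characters: %d, bytes: %d\n", length($chars), length($bytes);
```

The same line comes back as two characters when decoded and four bytes when read raw, which is the encode-and-decode-at-the-edges rule in miniature.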




