Perl.com

Editor's note: Perl guru Tom Christiansen created and maintains a list of 44 recipes for working with Unicode in Perl 5. Perl.com is pleased to serialize this list over the coming weeks.

℞ 15: Declare STD{IN,OUT,ERR} to be UTF-8

Always convert to and from your desired encoding at the edges of your programs. This includes the standard filehandles STDIN, STDOUT, and STDERR.

As documented in perldoc perlrun, either the PERL_UNICODE environment variable or the -C command-line flag tells Perl to decode input from and encode output to these filehandles as UTF-8. Use the S option:

     $ perl -CS ...
     # or
     $ export PERL_UNICODE=S

Within your program, the open pragma allows you to set the default encoding of these filehandles all at once:

     use open qw(:std :utf8);

Because Perl uses IO layers to implement encoding and decoding, you may also use the binmode operator on filehandles directly:

     binmode(STDIN,  ":utf8");
     binmode(STDOUT, ":utf8");
     binmode(STDERR, ":utf8");
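The layers that binmode applies are the same ones open accepts, so their effect can be observed without touching the real standard streams. The following is a minimal sketch, assuming an in-memory filehandle as a stand-in for STDOUT and STDIN; the variable names are illustrative only. It uses :encoding(UTF-8), which validates its input, where the recipe uses the laxer :utf8.

```perl
use strict;
use warnings;

# Write one character string through a UTF-8 encoding layer into an
# in-memory buffer, then read it back through a decoding layer.
my $text = "caf\x{E9} \x{263A}";    # "café ☺": 6 characters

my $buffer = '';
open(my $out, '>:encoding(UTF-8)', \$buffer) or die "open: $!";
print {$out} $text;
close($out) or die "close: $!";

# The buffer now holds encoded bytes: é takes 2 bytes, ☺ takes 3.
my $byte_count = length $buffer;

# Reading through the same layer turns the bytes back into characters.
open(my $in, '<:encoding(UTF-8)', \$buffer) or die "open: $!";
my $round_trip = <$in>;
close($in);

# prints "6 characters, 9 bytes"
printf "%d characters, %d bytes\n", length($round_trip), $byte_count;
```

Without a layer on the output handle, printing the same string would trigger a "Wide character" warning, which is the symptom this recipe is meant to prevent.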

℞ 14: Decode program arguments as locale encoding

While it may be most common in modern operating systems for your command-line arguments to be encoded as UTF-8, @ARGV may use other encodings. If you have configured your system with a proper locale, you may need to decode @ARGV appropriately. Unlike automatic UTF-8 @ARGV decoding, you must do this manually.

Install the Encode::Locale module from the CPAN:

    # cpan -i Encode::Locale
    use Encode qw(decode);
    use Encode::Locale;

    # use "locale" as an arg to encode/decode
    @ARGV = map { decode(locale => $_, 1) } @ARGV;
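To see why the choice of encoding matters, the same bytes can be decoded under two different encodings. This is a rough sketch that uses only the core Encode module and names the encodings explicitly instead of using the "locale" alias from Encode::Locale; decode_args is a hypothetical helper, not part of either module.

```perl
use strict;
use warnings;
use Encode qw(decode);

# The same two bytes, as a shell might pass them in an argument.
my $raw = "\xC3\xA9";

# Hypothetical helper: decode a list of byte strings from a named encoding,
# dying on malformed input. LEAVE_SRC stops decode from consuming its input.
sub decode_args {
    my ($encoding, @args) = @_;
    my $check = Encode::FB_CROAK() | Encode::LEAVE_SRC();
    return map { decode($encoding, $_, $check) } @args;
}

my ($as_utf8)   = decode_args('UTF-8',      $raw);   # one character: U+00E9
my ($as_latin1) = decode_args('iso-8859-1', $raw);   # two characters: U+00C3 U+00A9

# prints "UTF-8: 1 character(s); Latin-1: 2 character(s)"
printf "UTF-8: %d character(s); Latin-1: %d character(s)\n",
    length($as_utf8), length($as_latin1);
```

Decoding with the wrong encoding does not necessarily fail; it can silently produce different text, which is why the recipe asks the locale for the encoding instead of assuming UTF-8.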

℞ 13: Decode program arguments as utf8

While the standard Perl Unicode preamble makes Perl's filehandles use UTF-8 encoding by default, filehandles aren't the only sources and sinks of data. The command-line arguments to your programs, available through @ARGV, may also need decoding.

Perl can handle this operation for you automatically in two ways, or you can do it yourself. As documented in perldoc perlrun, the -C flag controls Unicode features. Use the A modifier to have Perl treat your arguments as UTF-8 strings:

     $ perl -CA ...

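You may also put -C on the shebang line of your programs. Note that, according to perldoc perlrun, since Perl 5.10.1 an option given this way must also be specified on the command line.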

The second approach is to use the PERL_UNICODE environment variable. It takes the same values as the -C flag; to get the same effect as -CA, write:

     $ export PERL_UNICODE=A

You may temporarily disable this automatic Unicode treatment with PERL_UNICODE=0.

Finally, you may decode the contents of @ARGV yourself manually with the Encode module:

    use Encode qw(decode_utf8);
    @ARGV = map { decode_utf8($_, 1) } @ARGV;
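Combining the manual approach with -CA or PERL_UNICODE=A would decode the arguments twice, which typically fails or corrupts them. One possible guard, sketched below, checks the ${^UNICODE} variable (documented in perldoc perlvar), where the value 32 corresponds to the A option. The helper decoded_args and the constant UNICODE_ARGV are hypothetical names, and the sketch uses the strict 'UTF-8' encoding where the recipe uses decode_utf8.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Bit value 32 in ${^UNICODE} corresponds to the A option of -C and
# PERL_UNICODE (see perldoc perlrun).
use constant UNICODE_ARGV => 32;

# Hypothetical helper: return decoded copies of the arguments, skipping
# the decode step when Perl has already performed it.
sub decoded_args {
    my ($flags, @args) = @_;
    return @args if $flags & UNICODE_ARGV;
    my $check = Encode::FB_CROAK() | Encode::LEAVE_SRC();
    return map { decode('UTF-8', $_, $check) } @args;
}

@ARGV = decoded_args(${^UNICODE}, @ARGV);
```

Passing the flags in as a parameter keeps the helper easy to exercise with either setting, independent of how the interpreter was started.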