℞ 34: Unicode column-width for printing
Perl's printf, sprintf, and format
think all codepoints take up 1 print column, but many codepoints take 0 (combining marks, for example) or 2 (East Asian wide characters, for example). If
you use any of these builtins to align text, you may find that Perl's idea of a string's width doesn't match the number of columns the string actually occupies when printed.
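As a minimal sketch of the problem, the snippet below pads the same word in both normalization forms with sprintf. It assumes a UTF-8 terminal and uses only core modules; the variable names are illustrative. Because sprintf counts codepoints, the decomposed form (which carries an extra zero-width combining accent) receives one fewer space and should appear one column short.

```perl
use v5.14;
use utf8;                               # this source contains literal UTF-8
use open qw(:std :encoding(UTF-8));     # encode output as UTF-8
use Unicode::Normalize qw(NFC NFD);

my $nfc = NFC("crème");   # 5 codepoints: è is one precomposed codepoint
my $nfd = NFD("crème");   # 6 codepoints: "e" followed by U+0300 COMBINING GRAVE ACCENT

# %-10s pads to 10 codepoints, not to 10 print columns.
my $padded_nfc = sprintf("%-10s|", $nfc);
my $padded_nfd = sprintf("%-10s|", $nfd);

say $padded_nfc;
say $padded_nfd;   # the "|" lands one column to the left of the line above
```

Both padded strings have the same length in codepoints, which is exactly why they do not line up on screen.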
The Unicode::GCString
module's columns() method considers the width of each codepoint
and returns the number of columns the string will occupy. Use this to determine
the display width of a Unicode string.
To show that normalization makes no difference to the number of columns of a string, we print out both forms:
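# cpan -i Unicode::GCString
use v5.14;     # enables say; part of the series' standard preamble
use utf8;      # this source contains literal UTF-8 characters
use open qw(:std :encoding(UTF-8));
use Unicode::GCString;
use Unicode::Normalize;

my @words = qw/crème brûlée/;
@words = map { NFC($_), NFD($_) } @words;

for my $str (@words) {
    my $gcs  = Unicode::GCString->new($str);
    my $cols = $gcs->columns;          # display width, not codepoint count
    my $pad  = " " x (10 - $cols);
    say $str, $pad, " |";
}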
... generates this to show that it pads correctly no matter the normalization:
crème      |
crème      |
brûlée     |
brûlée     |
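The same idea can be wrapped in a small reusable function. The sketch below is one possible approach, not part of Unicode::GCString itself: pad_to_columns() is a hypothetical helper name, and it assumes the module is installed and the terminal is UTF-8. It also covers the two-column case, using East Asian wide characters, and leaves strings that are already wider than the field unchanged.

```perl
use v5.14;
use utf8;
use open qw(:std :encoding(UTF-8));
use Unicode::GCString;

# Hypothetical helper: right-pad $str with spaces to $width print columns.
sub pad_to_columns {
    my ($str, $width) = @_;
    my $cols = Unicode::GCString->new($str)->columns;
    my $fill = $width - $cols;
    $fill = 0 if $fill < 0;    # overlong strings pass through untruncated
    return $str . (" " x $fill);
}

# "東京" is two wide characters, so it should occupy four columns.
say pad_to_columns($_, 10), " |" for "crème", "東京", "abc";
```

Returning the padded string rather than printing it keeps the helper usable inside larger sprintf-free formatting code.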
Previous: ℞ 33: String Length in Graphemes
Series Index: The Standard Preamble
Next: ℞ 35: Unicode Collation



