A few concepts in Perl 6 are strange at first sight. They seem hard to understand, but it's only because they're new and different. They aren't deep mystical concepts known only to Tibetan lamas. Anyone can understand them, but it helps to start with a common-sense explanation.
This article looks at the concepts of "topic" and "topicalizer". The words aren't quotes from a particularly nasty bit of Vogon poetry. They're actually common terms from the field of linguistics ... which some might say is even worse. Still, the best way to understand topic in Perl is to understand its source.
Topic in Linguistics
Every larger unit of human language has a topic -- whether it's a sentence, a paragraph, a conversation or some other sizable chunk. The topic is the central idea of the unit. It's the focus of what's communicated. Native speakers usually have no trouble figuring out what the current topic is, when they think about it. If two little old ladies were talking over a cup of tea:
"I saw Lister yesterday."
"Really? What's he up to these days?"
"Oh, you know, drunk again, and mooning over that awful Krissy Kochanski."
etc ...
and someone asked an observer what the conversation was about, they would instantly reply "Lister".
Topicalizers in Linguistics
A topicalizer is simply a word that marks some thing or some idea as the current topic. In English, we have topicalizers such as "for", "given" and "regarding":
"For our first trick tonight, ladies and gentlemen, my partner Kryten will attempt to eat a boiled egg."
"Given that God is infinite, and that the universe is also infinite, would you like a toasted tea-cake?"
"Regarding topicalizers, I should point out that this sentence starts with one."
Topic in Perl
Now we need to adapt the linguistic definition of topic to Perl. In Perl, the topic is the most important variable in a block of code. It can be any variable: a scalar, array, hash, object. To be more accurate, it's the underlying storage location that's the topic. This might sound a little too abstract, but it's an important distinction. Variables are really only names we use to get at stored values. A single storage location can have multiple names. In English, "Rimmer", "he" and "the hologram" could all appear in a text meaning the same person. In Perl $_, $name, %characters{'title'} and any number of other variables could appear in a section of code as different ways of accessing a single value. And if that value is the current topic then all the variables connected to it are too. This will be important later.
At this point, the average reader is thinking "That's very interesting, but why do I care what the topic is? I've gotten along just fine without it all these years. Why start now?"
And the answer is: it's not required. No one has to understand topic to use it, any more than they have to understand gravity to catch a ball.
Why? It's a really simple rule. We'll call it the first law of topics:
Topic is $_. Any time a value becomes the current topic, $_ becomes
another one of the names for that value. We say $_ is aliased to it. So,
all it takes to use topic is to use $_ either explicitly or implicitly
in all the old familiar places, like chomp, print and
substitutions, and in a few new places, like when statements and the
unary dot.
Even so, it's still a good idea to understand topic. Understanding gravity makes a number of things that seem unrelated suddenly fit. Things like apples falling, planes crashing, the way the moon and sun move, baseball, and rollercoasters. It's the same with topic. Any programmer can use $_ without understanding topic. But when they understand topic, it becomes a logical system instead of a random collection of "things $_ does". I like logical systems.
Topicalizers in Perl
A topicalizer in Perl is a keyword or construct that makes a value
become the current topic. The current topicalizers are given, for,
method, rule, ->, certain bare closures, and CATCH
blocks, but the cast of characters keeps growing.
Coal and Switches
Perl 6's switch, the given construct, is the prime example of a
topicalizer. Its sole purpose is to make the value passed to it the
current topic within the block.
given $name {
    when "Lister" { print "I'm a curryaholic." }
    when "Cat"    { print "Orange?! With this suit?!" }
    when "Rimmer" { print "4,691 irradiated haggis." }
}
So, in this example, the given makes $name the current topic by
aliasing it to $_. Then the when statements compare against $_. After
the block, $_ regains the value it had in the outer scope. In Perl 6,
$_ is just an ordinary lexical variable and every topicalizer creates
its own $_ variable, lexically scoped to its associated block.
Fruit Loops and M&M's
The for loop is the classic topicalizer. It was topicalizing long
before most of us had a word for the activity. for is similar to
given, but instead of creating a single topic, it creates a series of
topics, one for each iteration.
for @orders {
    when /scone/ {
        print "Would you like some toast?";
    }
    when /croissant/ {
        print "Hot, buttered, scrummy toast?";
    }
    when /toast/ {
        print "Really? How about a muffin?";
    }
}
Just like given, for takes a value, in this case the current
element of the array, and makes it the topic.
In simple cases like these, both for and given create the $_ alias
read-write. This is the same as Perl 5: any change to $_ inside the
block modifies the original value.
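Here's a minimal Perl 5 sketch of that read-write aliasing. The order strings are made up for illustration; the only point is that changes made through $_ land in the array itself.

```perl
use strict;
use warnings;

# Made-up sample data for illustration.
my @orders = ("scone\n", "croissant\n", "toast\n");

# Each pass aliases $_ to the element itself, not to a copy,
# so chomp and s/// both change @orders in place.
for (@orders) {
    chomp;
    s/toast/muffin/;
}

print join(", ", @orders), "\n";   # prints: scone, croissant, muffin
```

After the loop, @orders holds the chomped and substituted strings, even though the loop body never mentions @orders by name.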
Bow and Arrow
The new and improved arrow (->) is the most flexible topicalizer.
It appears in a variety of different contexts. By itself ->
creates an anonymous subroutine just like sub.
-> $param { ... }
# is the same as:
sub ($param) { ... }
The only differences are that -> doesn't require parentheses
around its parameter list, and that -> topicalizes its first
parameter.
In the following example, the first expression creates an anonymous sub
and stores it in $cleanup. When the sub stored in $cleanup executes,
the $line parameter takes the string argument and becomes the current
topic, so both $line and $_ are aliased to the value of $intro.
Substitution and print, two of the usual suspects, then use the topic
as their default.
$cleanup = -> $line is rw {
    s:w/Captain Rimmer!/the bloke/;
    $line _= " who cleans the soup machine!";
    print;
}
$intro = "Fear not, I'm Captain Rimmer!";
$cleanup($intro);
Unlike the simple for and given, the arrow creates its aliases
read-only by default. The is rw property marks both the named alias
and the $_ alias as read-write. Without the property attached, any
statements within the block that modify $line or $_ cause a compile-time
error just as if they had been explicitly flagged is constant.
The arrow isn't limited to working alone. It can also combine with other topicalizers. When it does, it creates a named alias for the current topic.
for @lines -> $line is rw {
    s:w/Captain Rimmer!/the bloke/;
    $line _= " who cleans the soup machine!";
    print;
}
As the for iterates over the array it aliases every element in turn
to $line and to $_. This takes the place of the Perl 5 way of aliasing a
loop variable:
# Perl 5
for my $line (@lines) {
    $line =~ s/Captain Rimmer!/the bloke/;
    $line .= " who cleans the soup machine!";
    print $line;
}
The Perl 6 way has some added benefits, though. Since the arrow aliases
both $line and $_ to the current value, it works with the defaulting
constructs, like print, but also provides a more meaningful name than
$_ when an explicitly named variable is necessary.
The first example of for and the example of the anonymous sub
reference are fascinatingly similar. The only difference is one is
stored in a variable to be called later and one is tacked onto a for.
Really, all the for example has done is replace the loop's block with
an anonymous sub reference. This is the second advantage of the Perl 6
way. Because $line is now the parameter of a subroutine, it's
automatically lexically scoped to the block. The my happens
implicitly.
The arrow also combines with constructs that aren't topicalizers, like
if and while, and allows them to topicalize.
if %people{$name}{'details'}{'age'} -> $age {
    print "$age already?\n";
    if $age > 3000000 {
        print "How was your stasis?\n";
    } elsif $age < 10 {
        print "How 'bout a muffin?\n";
    }
}
As the if tests the truth value of the element of the data structure,
the arrow also aliases that value to $age and to $_. The example could
have accessed the hash of hashes of hashes directly each time it needed
the age value, but the short alias is much more convenient.
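For comparison, a rough Perl 5 counterpart might look like the sketch below. It's only an approximation: the my declaration copies the value into $age instead of aliasing it, and $_ is left alone. The %people data and the @said array are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical data, mirroring the shape of the structure used above.
my %people = ( Lister => { details => { age => 3_000_025 } } );
my $name   = 'Lister';
my @said;

# Perl 5 copies the tested value into $age rather than aliasing it.
# $age is visible in the if block and in any elsif/else that follows.
if ( my $age = $people{$name}{details}{age} ) {
    push @said, "$age already?";
    if ( $age > 3_000_000 ) {
        push @said, "How was your stasis?";
    }
    elsif ( $age < 10 ) {
        push @said, "How 'bout a muffin?";
    }
}

print "$_\n" for @said;
```

The short name is just as convenient here, but because $age is a copy, assigning to it would not change the value stored in the hash.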
This feature is really only useful with simple truth tests. The truth
value tested in the following example isn't 3 or $counter, it's the
result of a complex conditional, $counter > 3.
if $counter > 3 -> $value {
    # do something with $value
}
The result will be true or false, but if it's false, the block will never execute. In fact, when the truth test is false, $value is never aliased at all. It simply doesn't exist. A lexically scoped variable with a true value would have the same effect.
if $counter > 3 {
    my $value = 1;
    # do something with $value
}
So the arrow isn't a Ronco plum-pitter, yoghurt-squirter, do-everything
tool. When it's useful, it's very, very useful, but when it's not...
well... don't use it. :)
Method in my Madness
Methods topicalize their invocant. An invocant is the object on which a method is called. After saying that 10 times fast, anyone can see why it needed a name. The design team chose "invocant". Methods topicalize the invocant when it's a named parameter like $self:
method sub_ether ($self: $message) {
    .transmit( .encode($message) );
}
and when it's left implicit:
method sub_ether {
    .transmit( .encoded_message );
}
# or
method sub_ether (: $message) {
    .transmit( .encode($message) );
}
This is handy in short methods. The unary dot is just another defaulting
construct. Any method call without an explicit object executes on the
current topic. In the previous examples, .transmit is exactly the
same as $self.transmit and $_.transmit.
The Sub of All Fears
Unlike methods and the arrow, ordinary subs don't topicalize, at least not by default. The following example will either print nothing, or else print a stray $_ that is in lexical scope wherever the sub is defined.
sub eddy ($space, $time) {
    print;
}
Subs can topicalize, though, with a little help from the is topic
property. The property flags a parameter as the topic within the
subroutine's block. It can be attached to any parameter in the list, but
not to more than one parameter at a time.
sub eddy ($space, $time is topic) {
    print;
}
Built-in functions like print that default to the current topic when
they're called are incredibly useful. Wouldn't it be great to have
user-defined subroutines that behaved the same way? But, since $_ is now
just an ordinary lexical variable, a subroutine generally can't access
the topic from its caller. It can only access variables in the scope
where it is defined.
The is given property gives subroutines access to their caller's
topic. It accesses the topic within the caller's scope and binds it to
the property's parameter.
sub print_quotes is given($default) {
    print "Random Quote: ";
    print $default;
}
...
given $quote {
    print_quotes;
}
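By way of contrast, here is a small Perl 5 sketch, where $_ is a global variable and a subroutine sees its caller's topic without any help. The quote_line sub and the sample string are hypothetical, made up for this illustration.

```perl
use strict;
use warnings;

# In Perl 5, $_ is a global variable, so a sub called from inside
# a loop sees whatever value that loop has aliased to $_.
sub quote_line {
    return "Random Quote: $_";
}

my @lines;
for ("sample quote") {          # placeholder string
    push @lines, quote_line();  # no argument passed; the sub reads $_
}

print "$_\n" for @lines;
```

That convenience is exactly what a lexical $_ takes away, and what is given hands back in a more explicit form.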
The is given property can appear on subroutines with a full parameter
list as well. The only restriction is that the property's parameter
can't have the same name as a parameter in the full list.
sub print_quotes (*@quotes) is given($default) {
    print "Random Quote: ";
    if ( @quotes.length > 0 ) {
        print @quotes;
    } else {
        print $default;
    }
}
(A true simulation of the built-in print function would use multimethod dispatch, but that's outside the scope of this article.)
The property's parameter can have any name, but a parameter named $_ or
one that has the is topic property attached will set the caller's
topic as the topic within the subroutine as well.
sub print_quotes is given($_) {
    # alias $_ to caller's $_
    print; # prints the value of caller's $_
}
# or
sub print_quotes is given($default is topic) {
    # alias $default to caller's $_
    # and to $_ within the subroutine
    print; # prints $default, the value of caller's $_
}
Perl Rules!
Grammar rules and closures within a rule topicalize their state object. This is convenient because it means methods on the state object can use the unary dot syntax.
m:each/ aardvark { print .pos } /
The state object for a rule is similar to the $self object for a method. It's an instance of a grammar class. Named rules are really just named methods called on the state object and anonymous rules and closures within a rule are really just anonymous methods called on the state object. Unfortunately, that's just enough information to be tantalizing without actually being useful, but a full explanation could take up an entire article. Still, knowing that the state object is like a $self object is a step in the right direction.
The CATCH-er in the try
CATCH blocks always topicalize the error variable, $!. This
streamlines the exception catching syntax because the CATCH block
acts as its own switch statement.
CATCH {
    when Err::WrongUniverse {
        try_new_universe();
    }
}
This is much tidier than the equivalent:
CATCH {
    if $!.isa(Err::WrongUniverse) {
        try_new_universe();
    }
}
The Bare Truth
Bare closures topicalize their first argument. If the block uses placeholder variables, the topic is also aliased to the Unicode-abetically first variable. The topic is lexically scoped to the block, but it is a read-write parameter, so modifications to $_ within the block modify the original value.
%commands = (
    add  => { $^a + $^b },
    incr => { $_ + 1 },
);
Constructs like grep and map are no longer special because they
use $_ within the block argument. They simply benefit from the normal
behavior of bare blocks.
@names = map { chomp; split /\s*/; } @input;
Nesting Instinct
Nested topicalizers are a little more complicated. The following example
starts with $name as the topic and the first case matches against it.
Within the case, the print also defaults to the current topic. The
second case is a little more complicated; it contains a loop. Within the
loop is another print statement. This one also defaults to the
current topic which is... Hmmmm... is the topic $name or $quote?
given $name {
    when /Rimmer/ {
        print;
        print rimmer_quote();
    }
    when /Kryten/ {
        for kryten_quotes() -> $quote {
            print;
        }
    }
}
The answer falls out of a few simple rules:
- There is only one topic at a time.
A series of nested topicalizers doesn't create a collection of topics. The interpreter doesn't have to sort through a complicated set of options to know what the current topic is. There will never be any ambiguity. A script or module may have a series of different topics, but only one at a time.
- Topic obeys the lexical scope of topicalizers.
For the programmer, determining the current topic is never any more complex than tracing outward to the nearest topicalizer. The scope of the topic is always restricted to the scope of the topicalizer that created it. So the example above will print the inner topic, $quote.
An equivalent pair of nested topicalizers in Perl 5 would have printed
the outer topic instead. That's because a Perl 5 topicalizer never
created a $_ alias at the same time as a named alias. It's a fairly
common trick in Perl 5 to use a named alias with a for loop to avoid
clobbering $_. That trick doesn't work anymore. Which brings us to the
third rule:
- To keep outer topics, use a named alias.
It's just the old trick inverted. Instead of using a named alias with the inner topicalizer to avoid clobbering the outer topic, it uses a named alias with the outer topicalizer to access it after the topic has changed.
when /Kryten/ {
    for kryten_quotes() -> $quote {
        print $name;
        print;
    }
}
This really fits better with the way humans think. It makes more sense to give a name to the thing worth keeping than to give a name to the thing to be thrown away.
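For reference, here's a Perl 5 sketch of the old trick mentioned above: the inner loop uses a named alias, so $_ still refers to the outer topic. The names and quotes are placeholder data, not part of the original example.

```perl
use strict;
use warnings;

# Placeholder data for illustration.
my @names  = ("Kryten");
my @quotes = ("first quote", "second quote");
my @out;

for (@names) {                    # outer loop topicalizes via $_
    for my $quote (@quotes) {     # named alias only; $_ is not touched
        push @out, "$_: $quote";  # $_ is still the outer topic
    }
}

print "$_\n" for @out;
```

Both output lines start with the outer value, "Kryten", because in Perl 5 the named inner alias never displaces $_.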
Methods have the same problem, but in their case it means that even though the unary dot is really handy in simple methods, in more complex methods with nested topicalizers it's better to use a named parameter for the invocant.
method locate ($self: *@characters) {
    .cleanup_names(@characters);
    for @characters -> $name {
        .display_location($name);
    }
    .change_location('Holly');
}
The .display_location call won't invoke a method on $self. It will
try to call a method on $name, and fail (unless $name is an object with
a .display_location method). The code will have to call the method as
$self.display_location() instead. It would be clearer to add $self in
front of the other method calls as well, but that's entirely a matter of
style.
Multiple aliases
Another complication is that topicalizers aren't restricted to a single
alias. A for loop might iterate through an array several elements
at a time:
for @characters -> $role1, $role2, $role3 {
    ...
}
iterate through several arrays, one after the other, taking a few elements at a time:
for @humans, @betelgeusians, @vogons -> $role1, $role2 {
    ...
}
or iterate through several arrays at the same time:
for @characters; @locations -> $name; $place {
    ...
}
But no matter how complicated the code gets, the topic stays the same. The rules of the game are:
- There is only one topic.
This rule should look familiar. It might even deserve to be called "the second law of topics". With nested topicalizers, the restriction means that two topics from different scopes will never be accessible at the same time. With multiple aliases, it means that while a topicalizer can create more than one alias, only one of the aliases can be the topic.
- The topic is the first parameter.
This rule makes it easy to pick out the topic. In the first two examples, the topic is $role1, and in the third example it's $name.
There is one exception to the rule. The is topic property can select
any parameter as topic in place of the default first parameter.
for @characters -> $role1, $role2 is topic, $role3 {
    ...
}
The Final Frontier
That's pretty much all there is to topic. Hopefully, this article has pushed one more thing into the "Gee, that's easy!" category. But, if not, carry away one idea: that first law, "Topic is $_". The next time conversation turns to Perl 6 and topic, that simple translation will make it understandable.



