Planet Perl is an aggregation of Perl blogs from around the world. It's an often interesting, occasionally amusing, and usually Perl-related view of a small part of the Perl community. Posts are filtered on Perl-related keywords. The list of contributors changes periodically. You may also enjoy Planet Parrot or Planet Perl Six for more focus on their respective topics.
Planet Perl provides its aggregated feeds in Atom, RSS 2.0, and RSS 1.0, and its blogroll in FOAF and OPML.
There is life on other planets. A heck of a lot, considering the community of Planet sites forming. It's the Big Bang all over again!
This site is powered by Python (via planetplanet) and maintained by Robert and Ask.

Planet Perl is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License. Individual blog posts and source feeds are the property of their respective authors, and licensed independently.
UK and Ireland thank-yous
I've spent the last few days travelling around the UK and Ireland,
and it's just been lovely. By far one of the nicest things about
travelling is the hospitality and generosity of the Perl community,
and so I want to take a moment to say thank-you to a few special
people who made my travels so enjoyable.
Thank-you to Drew, Kimberley, and Samantha for giving up their home in Dublin, driving around two crazy Australians, and picking us up when we got completely lost. An extra special thanks to Kimberley for some amazing cooking; it's not every day that we get to stay with a professional chef!
Thank-you to Murray and Becky for giving us booze, conversation, lodging, and wifi in their most amazing Edinburgh home. I really wish I got to spend more time in Edinburgh; both the city and its people are beautiful, and haggis is plentiful and tasty.
Thank-you to Andy Armstrong for driving us around the spectacular English countryside, putting us up for the night (and sleeping on the couch!), and then driving us on to Darlington. Given the cost of trains in the UK, Andy saved us a small fortune. Particular thanks goes to Andy for the trip to Hadrian's Wall and Vercovicium, which was spectacular!
Thank-you to my Great Aunt Jennie, who isn't at all involved in the Perl community, but gave us lunch, tea, great conversation, and acted as a guide for many miles of walking and photography around Darlington.
Thank-you to the charity shop five minutes walk from Great Aunt Jennie's house, which had an almost complete Space Hulk 1st Edition set with expansion cards for only £1.50.
Thank-you to Smylers, for putting us up in Leeds, and in particular for putting up with me running off around the city with my pirate friends.
Finally, thank-you to Léon Brocard for organising a London.pm meeting, and to Piers Cawley for the lift to the airport.
Posted: 17th August 2009.
Tags: ireland perl travel uk yapceu2009
It needs a lot of work, but I have an unnamed Web site that I'm playing with. Initial data is populated from Freebase (which I'm still learning) and it uses DBIx::Class and Catalyst. I've put the source code on Github.
The only thing it really does now is let you browse countries (Freebase code to fetch data at util/country.pl) and see the Google maps for them. I might add more as I have time, but with a trip to Portland coming up, I might not.
Be kind. It's running off the default Catalyst server and SQLite :)
With apologies to JJ for stealing CSS and images (I'll take 'em down if you like).
Note that while most development is happening on my local box, I'll sometimes push changes out to Hexten (hosted by Andy Armstrong, if you're curious). Thus, the site might sometimes be unavailable. It's just a toy after all.
Two weeks ago I attended YAPC::Europe 2009 in Lisbon, Portugal. This wasn't the first YAPC in Portugal - in 2005 we went to Braga, a university town inland, and it was a very well organised conference. It's no surprise the YAPC in Lisbon was also amazingly well organised, as it had the same organisers - José Castro (cog), Alberto Simões (ambs) and Magda Joana Silva. Thank you so much José, Alberto, Magda and all the others involved - it was amazing.
The theme of the conference was "Corporate Perl" and this made a useful track in the vast schedule. It's amazing all the places Perl is used. Every talk I attended was great - the speakers knew their material and explained their content well. Some of the presentations are already online, see "Talk" in the schedule.
I liked the venue, three metro stops away from our hotel (also very walkable). The main room was huge and the three other tracks were just nearby, leaving a large space behind these for socialising with the just-over 300 attendees (and easy access to the sun). This was also where the terribly-important refreshments were, including oh-so-tasty pastéis de Nata.
The attendees' dinner was great. It was in a huge churrascaria (meat on swords!) by the river. It was also a good location for the Quizz Show, with 16 two-person teams fighting to be the geekiest. This was quite hard, as the questions varied from Portuguese history, Star Trek, Buffy and Perl internals ("How many levels of precedence does Perl 5.10.0 have?") to Unix history.
Community was very important at the conference - there are lots of separate groups of Perl people clustered around the core, Perl modules, Perl projects, or even cities, and conferences are where you can mix, meet people in real life, and start making crazy plans for the future. There are so many exciting things going on in Perl and I'm really looking forward to YAPC::Europe in Pisa in 2010!
Mark Stosberg kicks ass. Not only is he a driving force in CGI::Application, he's also working on some scutwork in CGI.pm. He got privileges to the CGI.pm bug queue and has closed 100 of the 150 bugs in the queue. (He's done this for me with the WWW::Mechanize queue, too.)
Note that I didn't say he fixed 100 tickets, but that he closed 100 tickets. As he notes:
Many had already been addressed by Lincoln some time ago. Those were simply closed. Still, I found about 20 fairly ready-to-go patches, and those have now been processed and released today as CGI.pm 3.45. Whenever code changes were made, I also strived to make sure new automated tests were added that covered those cases. You may be surprised how many methods in CGI.pm have no automated tests for them at all.
Although Mark Stosberg is a fine programmer in his own right, what he's been doing here requires no special skills, no megabrain rockstar code chops. He's doing project management and cleanup. It's so simple, even you can do it.
Yes, I'm talking to you out there, thinking "I've always wanted to get into open source projects," whether it's to improve your career or just to help out. There are always ways you can pitch in that don't involve writing new code. I hope you'll find some and join us.
While reading this review of the wildly popular Flip video recorder, I was struck by yet another situation in which popularity has resulted from a ruthless focus on the First Time Experience.
http://www.nytimes.com/2008/03/20/technology/personaltech/20pogue.html?_r=1&pagewanted=all
David Pogue compared the experience of using the Flip to using Graffiti on the Palm.
Jeff Hawkins, a brain scientist, knew something about people: if you're successful at something the first time you try, you fall instantly in love with it. And sure enough: people fell in love the first time they wrote on a Pilot with the special alphabet and saw their letters turn into perfectly typed text. ...
Somebody at [Flip maker] Pure Digital must have sat through countless meetings, steadfastly refusing to cede any ground to the forces of feature creep. A juicy bonus, if not a lucrative speaking career at management seminars, is definitely in order.
This is the same principle that made PHP so successful, and makes the iPhone interface so compelling (and has made a number of Apple projects into such compelling products). It's also what I was trying to target with Strawberry Perl (for the audience of Unix Perl people anyway).
The new "Six" installer (that installs Perl 5, Padre and Rakudo at the same time) is definitely shiny, but as it stands it is most certainly NOT First Time Experience friendly.
Installation works fine, and so does starting up Padre.
But the next most obvious action (File -> New -> Perl 6 Script) causes Padre to immediately crash. The Perl 6 plugin is also not on by default (so Perl 6 support doesn't actually work), and Perl 6 execution only works if you save as a .p6 file, which you have no way of knowing you have to do, because the Save dialog doesn't suggest a file name for your code.
So please, by all means, if you are tolerant and familiar with Padre or want to help us out then install Six and try it.
But please restrain yourself from promoting it. Yes, it's neat. But it's VERY alpha and simply not ready for the unwashed Perl 6-curious masses to try out.
Laurent Dami blogged Object-oriented accessors considered (sometimes) harmful and Yuval Kogman responded to it.
I agree that accessing hash elements directly can be tempting, but for me the killer argument in favor of accessor methods is polymorphism. Simply put, if you access hash elements directly, you're effectively making it completely impossible for someone else to subclass your class, or for your own class to subclass another class later. That's all.
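To make that concrete, here is a minimal sketch (the class names are mine, not from either post) of how direct hash access silently bypasses a subclass:

package Point;
sub new { my ($class, %args) = @_; return bless {%args}, $class }
sub x { return $_[0]->{x} }

package LoggedPoint;
our @ISA = ('Point');
sub x { my $self = shift; warn "x read\n"; return $self->SUPER::x }

package main;
my $p = LoggedPoint->new( x => 5 );
print $p->x;    # goes through the subclass and logs
print $p->{x};  # skips the subclass entirely: no logging

The second read works today, but it forecloses exactly the kind of override the subclass was written for.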
And I agree strongly with one of Laurent's criticisms: With accessors, you never know where things really happen, making them difficult to debug. To use Laurent's example:
if ($c->request->body->length < $self->min_body) {
You have to step through a lot of methods to get to the right-hand side. It's the same with any expression.
What I would really like in the Perl debugger (but ENOTUITS) is the possibility to inspect the expression in a precedence-based tree-view like this:
1: <
2:   $c
3:     ->request
4:       ->body
5:         ->length
6:   $self
7:     ->min_body
and then being able to set a one-time breakpoint for line 7, that is, when entering the min_body() method.
I always enjoy reading the papers of the SIGGRAPH conference. It's nice to see what new graphical techniques are coming. Here are my favourite picks from 2009:
The last one contains great quotes like "A stretched cow that is torn when two bars scissor together" and "These images from an animation show viscoelastic horses being dropped onto one another".
Is there anyone who has never played Frozen Bubble? Is there anyone who does not know that it is written in Perl?
Apparently desktop game writing was not high on the agenda of Perl programmers, but it seems SDL_Perl, the Perl binding to the Simple DirectMedia Layer, has recently started a new life.
Work has begun to improve the binding, and a few new people got access to the web site, so we have started to update that too.
I think this can be yet another area where Perl could get people interested in the language, and besides that, it is way cool. So I took it upon myself to try to help the project a bit. There are two areas where I can help.
One of them is to improve the content of the website (not the look, that needs to be done by someone else). So the first thing I would like to do is to compile a list of projects already using SDL Perl. If you have such a project, even a small one or a really new one, such as the Perl Invadors of Thomas Klausner, please let me know, either here as a comment or on the SDL mailing list. I'll add a link to your project from the SDL web site.
The other area is helping to build a plugin for Padre that will make developing SDL applications really easy. The problem of course is that I have never used SDL before and have no idea what extra help an SDL Perl developer might need, but if you have ideas, please let me know or get on board and help build the plugin.
This is a response to Laurent Dami's post about when to prefer direct access to the hash fields of an object over an encapsulated API.
The point of the article is that encapsulation is a tradeoff, especially in Perl where you could do some things more easily or efficiently if you decide to give it up.
While I agree that this tradeoff exists, I think that it's not the only tradeoff you're allowed to make, and that Laurent's reasoning in defense of hash access can be solved in ways that don't involve breaking encapsulation and that are in my opinion superior.
In short, I find most of his examples are either better solved by Moose, or fall short of being substantial in my opinion.
I will attempt to refute his arguments one by one.
Admittedly the fact that the results of object methods or subroutine application cannot be easily interpolated into strings is a big annoyance in Perl, but that's far from a good reason to avoid them.
I might be alone on this, but personally I find the rest of these examples quite troubling. I think they miss the point of OO programming.
By refactoring these examples into a real API we no longer need the comments to explain what they're actually doing. Maybe the code implementing these methods is not as idiomatic, but we don't need to know it's there (and since we don't know, it could be that it actually is as idiomatic as he likes under the hood).
The total lines of code may be longer for a single usage, but most applications that manipulate points in a 2D space would need to do these operations many times, reducing the overall verbosity:
my $opposite = $point->symmetry_transform;
my $zoomed = $point->zoom($factor);
# I'd actually use a wrapper object for this
# (DisplayPoint with a Menu and a Point delegate)
my $point = Point->new_with_traits(
traits => [qw(Menu)],
menu => { color => "green" },
...
);
There is one final example:
# temporary push aside
{ local $point->{x} += $far_away;
do_something_without_that_point_in_the_way();
} # point automatically comes back to its previous location
And though for this one there is no direct correspondence[1], the problem is that instead of passing the set of relevant points to do_something_without_that_point_in_the_way(), the point is mutated to temporarily hide its effects.
I've posted on immutability before and why I think that this approach is wrong, but suffice it to say that I would write the above using an explicit collection of points, and make the operation of creating new collections easy instead of monkey patching the data.
This problem can be solved by using an alternate method naming style. For instance MooseX::Policy::SemiAffordanceAccessor makes the default setter for an attribute named foo into set_foo.
Personally I feel this is overkill. If the relationship between the object and the method name is obviously that of a noun and an attribute, such as $thing->size(10), that is far from ambiguous.
For cases that are more involved, like a field manipulation that requires additional actions, I tend to have a method name that has a verb in it and keep the setter completely private. Since the method would be doing something more than just set_, that is reflected in the API.
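As an illustration, a minimal sketch of that style in plain Moose (the class, attribute, and method names are mine, not from either post):

package Account;
use Moose;

has balance => (
    is     => 'ro',
    isa    => 'Num',
    writer => '_set_balance',   # setter kept private by convention
);

# the public API is a verb, not a bare setter
sub deposit {
    my ($self, $amount) = @_;
    $self->_set_balance( $self->balance + $amount );
}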
This is a deficiency in the Perl debugger. I strongly believe that the maintainability of your code shouldn't suffer because of inferior tools. Someone might fix the Perl debugger one day. Though that's unlikely, it's much more unlikely that they will also fix all of my code.
Since the inner workings of Catalyst are obviously not what Laurent wants to debug in the example, the problem lies in the Perl debugger's inability to handle complex (and idiomatic) Perl expressions.
Furthermore, in this particular case it shouldn't be a problem: $self is known, and you could simply set a breakpoint for that class's min_body method and continue execution instead of stepping through every operation.
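For example, assuming $self is a MyApp::Controller::Foo (a hypothetical class name), the debugger session is just:

DB<1> b MyApp::Controller::Foo::min_body
DB<2> c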
Secondly, if Catalyst supported lazy body parsing, that line might be run before the body length is actually known (the body may not have been fully read), and this operation wouldn't be a simple accessor fetch but rather a blocking read and an update of the request object.
This polymorphism in the API is precisely what makes it possible to enable such optimizations in future versions of Catalyst without breaking existing applications. If we simply used length($c->{request}{body}) directly, there would be no room for future improvements, even on an HTTP abstraction that could support that easily.
This problem in particular is solved using MooseX::AttributeHelpers (note that this will be in core Moose soon, after receiving a big facelift).
In the more general sense, Moose deeply supports the encapsulation and packaging of patterns using its meta trait system. By applying traits to attributes we can alter their behavior, providing additional functionality in a concise and declarative way.
There's no need to reinvent or copy because of Moose's extensibility, something that is missing from other class systems, and this applies not only to non-scalar data, but to many other patterns for which MooseX modules were written.
Again, introspection is key here. KiokuDB would never be possible without generic data traversal for Moose objects. That's what sets it apart from Pixie.
I've been meaning to write a Data::Visitor::Moose for a long while but simply never got around to it. It's definitely possible, and even not that hard.
Here is a simple example demonstrating JSON integration:
method TO_JSON {
return {
map { $_->name => $_->get_value($self) }
$self->meta->get_all_attributes
};
}
But the really nice thing is that since all the attributes have extensible metadata we can be very particular about how we filter them. Unfortunately the JSON module's API doesn't allow us to specify parameters to the TO_JSON method, but if it did then we could very easily output different variants for different contexts.
From the point of view of the class being serialized (obviously the TO_JSON method could go in a role and get reused), we could decorate attributes with traits to determine their correct behavior. For instance an attribute that is used both by an Ajax frontend displaying the data and a REST JSON feed for this object could be declared as:
has name => (
traits => [qw(MyApp::View MyApp::REST)],
...
);
The actual implementation involves two roles (MyApp::View and MyApp::REST) with no implementation, and a simple $attr->does($role) call in the TO_JSON method.
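A minimal sketch of that filtering, written in the same style as the earlier TO_JSON example (my reading of the description above, using the role name already given):

method TO_JSON {
    return {
        map  { $_->name => $_->get_value($self) }
        grep { $_->does('MyApp::REST') }   # keep only attributes tagged for the REST feed
        $self->meta->get_all_attributes
    };
}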
This is not only cleaner, but also more powerful than violating encapsulation. The difference is that the encapsulation provided by e.g. Class::Accessor or Object::Tiny is just not introspectable or extensible.
I won't argue about this, but in my experience even with large (some would say gratuitous) amounts of method calls I have never been able to demonstrate that accessors or other small methods were an actual performance issue (that is, using a profiler on real code with real inputs).
My dying laptop can make about 1,000,000 method calls per second. My $20 a month virtual private hosting image does 3,000,000. The difference between 5 method calls and 500 in a single dynamic web request is still too small to actually matter, at least for the kind of apps that I write. 100x as many method calls for the same amount of code will not be 100x slower, but will probably be 10x more maintainable ;-)
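Numbers in that ballpark are easy to check for yourself; here is a minimal sketch using the core Benchmark module (the class is a stand-in):

use Benchmark qw(countit);

{ package Point; sub new { return bless { x => 1 }, shift } sub x { return $_[0]->{x} } }

my $p = Point->new;
my $t = countit( 3, sub { $p->x } );   # run the accessor for ~3 CPU seconds
printf "%.0f method calls per second\n", $t->iters / 3;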
That aside, the XS branch for Moose has been performing almost as fast as direct hash access in microbenchmarks, while still providing full type validation, support for lazy attributes, access control, etc. This is because XSUBs are faster than pure perl subroutines (they require no context scope, and a single opcode dispatch). And of course these accessors can be subclassed or swapped out for more involved implementations should the need arise.
The XS branch is unmerged and unfinished, but there are a number of other XS accessor implementations on the CPAN if that is actually a problem in your code.
Laurent concluded his post saying:
In Perl, fully encapsulated objects are sometimes the best solution, sometimes not; weight these considerations before taking strong design decisions.
An interesting design is the one of DBI : objects (handles) are totally encapsulated, yet they exploit the power of tie to expose their attributes through a conventional hashref API, instead of OO getter and setter methods. This is a very clever compromise.
As far as I am concerned, I purposely designed DBIx::DataModel to fully exploit the dual nature of Perl objects, having both the OO API for executing methods, and the hashref API for accessing column values. I wouldn't necessarily do that everywhere, but for row objects in a database, which are very open in nature, this just seemed an appropriate solution.
Specifically in the case of DBIx::DataModel, I think that hash access is a valid approach. But this is because DB records are not objects, but tuples. You can model tuples as objects, but not the other way around.
I couldn't agree more with his sentiments that this a tradeoff you should consider, but I think his cutoff point is very different from mine. I would need to be much more desperate to reach for "naughty" hash access when accessors would do.
Lastly, every time I've used the DBI API I really wished it didn't do that horrible hashref hack. Cleverness is bad: it's confusing to newbies, it's a red flag to experienced programmers new to this API, and, to reference his previous point about speed, tie is about 4 times as slow as normal method calls. To me that feature seems like a half-hearted attempt to allow polymorphism and encapsulation in an API where methods would break compatibility.
This complexity makes understanding DBI internals a lot harder, and by proxy makes writing DBDs harder too. This is a fact of life that we deal with because DBI has been around for longer than I've known to program and because it's still robust and performant. It has flaws but it makes up for them. That doesn't mean the tied hash attributes were a good idea.
I feel that my decision to be an OO purist, especially in a language where impurities are so tempting, has made my job as a maintenance programmer much easier for me and my coworkers.
To me polymorphism is a far more important idiom than concise hash manipulation. It may not be unique to Perl, but it's damned useful.
[1] $point->localize( x => $new_value, body => sub { ... } )
The Perl 6 design team met by phone on 12 August 2009. Larry, Allison, Patrick, Nicholas, and chromatic attended.
Allison:
Larry:
@_ and %_ to a sub if they use them in the body
void Nil, it falls out of existing syntax
Nil cast, but Damian pointed out that we couldn't put multiple statements as an argument
void prefix is more readable and allows the block form for multiple statements
Captures to Parcels
sub() at the front of an anonymous sub declaration
nofun assertion has a problem
defines infix specced for importation without loading
#` without opening bracket and complains
viv and gimme5 to converge to the same output
Patrick:
Any instead of Object, and I fixed that
print and say from PIR into the Setting
quote rule, but now they're circumfix
Parcel design works in the spec
Nicholas:
c:
Patrick:
Parcel changes in the spec so far
Larry:
Imagine that you need to store the following tree structure in your database:
        5
      / | \
    62  32  7
   /  \    / \
  9    3  29  1
             / \
           48   55
One way this is typically handled is by creating a table linking parents and children:
parent | child
-------+------
5 | 62
5 | 32
5 | 7
62 | 9
... and so on
Because SQL isn't brilliant at recursion (and because most of us aren't as brilliant as Joe Celko), we find ourselves writing several SQL calls to walk up the tree. However, this becomes prohibitively expensive when you have millions of objects in your tree and you need to repeatedly traverse it. Enter materialized paths.
A materialized path is a database denormalization in which you flatten each branch of the tree. You join all of the parents, in order, with a separator character which is guaranteed not to appear in the id. So the materialized path for 48 might be 5.7.1.48.
node | path
-----+---------
1 | 5.7.1
3 | 5.62.3
5 | 5
7 | 5.7
9 | 5.62.9
29 | 5.7.29
32 | 5.32
48 | 5.7.1.48
55 | 5.7.1.55
62 | 5.62
They can be tricky to maintain, but if you get them right, you can find all parent IDs in a single SQL statement.
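That works because the path column carries the ancestry with it; a quick sketch of the idea (assuming an already-connected DBI handle in $dbh and the tree table from this example):

my @ancestors    = split /\./, '5.7.1.48';   # (5, 7, 1, 48)
my $placeholders = join ',', ('?') x @ancestors;
my $sth = $dbh->prepare("SELECT * FROM tree WHERE id IN ($placeholders)");
$sth->execute(@ancestors);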
Today I ran into a nasty problem where I needed the parents, but I had to guarantee that the order of those parents was preserved at the database level (via ORDER BY). The SQL looks sort of like this:
SELECT *
FROM tree
WHERE id IN (5,7,1,48)
ORDER BY ...
Um, what goes in that ORDER BY? I need the items returned in the same order as the ids in the IN clause. I asked many people and they were stumped. They came up with Oracle-specific answers. Some suggested writing the ids to a temp table with sort criteria and joining on that. Others threw up their hands. One person suggested a complicated CASE statement.
Then one of my colleagues, Mark Morgan, said "can't you just ..."
The solution was rather counter-intuitive to me, but once I saw it, I was dumbfounded by its simplicity. Do you see it?
Update: I've posted the solution in the comments, so if you want to think about it a bit first, don't read 'em.
Dear Lazyweb, I can't figure this out. I'd like to do something like this:
perl -MPIPs -de '$s = PIPs->new->model("PIPs")->schema; $DB::single = 1'
That would theoretically enter the debugger and I could just hit 'c' and continue to that breakpoint, have my variable set and start debugging my app. Instead, I have the following nasty alias:
alias debug_pips='echo "\$s = PIPs->new->model('PIPs')->schema";perl -MPIPs -de1'
And I just cut-n-paste the echoed line into the debugger. Is there a better way?
A few years ago I started perl101.org. My tag line was "What every Perl programmer should know," and the site would cover a wide range of basic topics.
The goal is not to cover the topics in depth, but to give just enough of a taste to give the reader an idea of how to do something, and enough to go on to investigate further. It's an FAQ of sorts, but of techniques.
The idea came from having the same questions asked over and over again in IRC.
I also want to show examples of the Right Way to do things, like using chomp instead of s/\n$//. I want to tell people about tools like perldoc that I know many programmers don't even know exist.
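For example, the sort of minimal, side-by-side snippet I have in mind:

my $line = <STDIN>;
chomp $line;         # the Right Way: removes a trailing newline, if any
$line =~ s/\n$//;    # works, but slower and more cryptic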
I haven't done much with perl101 for a while, but I'd like to get it going again, fully awesome. I've put the code on github and I'm asking for contributions. Please fork it, add content, and let me know when your fork is ready to merge back to the master so I can republish. I'm specifically not making it a wiki, because I want to maintain editorial control to help keep it focused.
The site is also pretty ugly, and if you have better ideas for getting the content out there in an attractive way, I'd love to hear it.
There's also a perl101-talk mailing list and I welcome your participation there as well.
Let's put our collective heads together to help out newcomers to Perl!
Lately my pet project has been implementing delimited continuations for Perl. In an effort to keep myself motivated (it's hard and frustrating), I'll try and explain what I'm actually trying to do.
Caveat lector: the repository is nasty. There is no coherent process in the versioned history; it's just git commit -a at the end of the day, so the repository will be breaking. Secondly, I don't really know what I'm doing. I do in principle, but the details are still vague and I am discovering them as I go along.
So, WTF are delimited continuations? As Perl programmers the easiest way to understand them is probably Tom Moertel's favourite explanation, by Oleg Kiselyov.
As core hackers, the easiest way to understand them is to see how they are implemented. This is what I will stick with.
But really, the best way to understand continuations is to use a language that actually has them, and play around a bit. It's a strange concept to get used to. Unlambda, the INTERCAL of functional programming uses continuations precisely because they can be difficult to understand (though technically they aren't delimited).
Delimited continuations are structured around two operations, reset and shift.
reset is the delimiter, it saves a position in the call chain for a future call to shift.
shift is what actually creates the continuation. It does that by walking up the call chain from the current position, up to the reset, and saving everything in between into the continuation. The continuation itself is reified as a function, and is passed as an argument to the block given to shift. Hold that thought, we'll get back to that in a second.
This interaction between shift and reset is much like the one between die and eval. eval delimits the stack, and die is what's called an escape continuation:
eval {
warn "reached";
die $value;
warn "not reached";
};
return $@;
The die causes the code in the eval to be aborted, and sets the special variable $@ to $value. With shift and reset this could be expressed a little more elegantly as:
return cont_reset {
warn "reached";
cont_shift { return $value };
warn "not reached";
};
What happens in shift is similar to die. The stack is unwound all the way up to the beginning of reset. The difference is that the return value of the shift block becomes the return value of the reset block.
Scope::Upper makes this sort of abstraction possible already, as demonstrated by Continuation::Escape.
Semantically, though, eval/die and delimited continuations are actually very different. shift doesn't throw away the stack frames as it unwinds them, but stashes them into a data structure.
Delimited.xs introduces two structs, delim_t and cont_t.
delim_t is very trivial, it contains a snapshot of the various state variables in the Perl interpreter, like the stack pointers, the current opcode, etc. When we call cont_reset a new delimiter is pushed onto a linked list.
Inside cont_shift the init_cont function will create another delimiter, and then destructively move all the stack frames between the two delimiters into a cont_t. When it's done, the current state of the interpreter has effectively been rewound to that of the start delimiter, but unlike die, none of the data has been cleaned up; those stack frames are still "live", just not visible from the current state of the interpreter.
The restore_cont function takes a cont_t and appends copies of all the captured stack frames inside it to the current state of the interpreter. There are many things to fix up, and that is where the bulk of the implementation lies.
At the end of the restore_cont the next step for the interpreter is to resume execution right after the call to cont_shift, the end of the continuation capture.
The difference is when those stack frames are evaluated and we reach the end of the cont_reset block, instead of returning to the caller of cont_reset we return to the caller of restore_cont. To do this we overwrite the top level PERL_CONTEXT stack frame to use the current PL_op as the retop, instead of the one it was originally created with.
So how does restore_cont get called? We wrap the cont_t in a closure, that gets invoked as a function. Here is an example of no-op continuation usage:
my $value = cont_reset {
return 3 + cont_shift {
my $k = shift;
$k->(7);
};
};
is( $value, 10 );
This is a no-op because cont_shift unwinds the stack, gets the continuation in $k, and then immediately reinvokes it, restoring everything and causing cont_shift to return a value of 7 from cont_reset. This code is functionally equivalent to:
my $value = cont_reset {
return 3 + 7;
};
Which is really the same as
my $value = sub { return 3 + 7 }->();
But $k is more than that, it's literally a function that appends a bunch of things to the interpreter's stacks. You can call it multiple times:
my $add_three = cont_reset {
return 3 + cont_shift {
my $k = shift;
return $k;
}
};
In this example cont_shift returns $k instead of invoking it. Since the stack has been unwound, this causes cont_reset to return $k.
So what happens when you invoke $add_three from the outside? The stack frames from reset until shift are appended, so the interpreter state is waiting for a value from shift to add 3 to.
This means that you can call $add_three->(7) and get back a value of 10, any number of times. In fact, this is what the basic sanity test does.
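Paraphrased with Test::More, that sanity test amounts to something like this:

use Test::More tests => 2;

is( $add_three->(7),  10, 'continuation resumes after cont_shift' );
is( $add_three->(40), 43, 'the same continuation can be invoked again' );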
Unfortunately what goes on behind the scenes isn't that simple (almost nothing is when it comes to Perl internals).
delim_t is initialized by init_delim to capture relative offsets for PL_stack_sp, PL_markstack_ptr, and the values of cxstack_ix, PL_scopestack_ix, PL_savestack_ix and PL_tmps_ix (which are relative offsets anyway). This information captures all of the stack states.
In addition it keeps track of a number of interpreter variables corresponding to the current state (like the current opcode, PL_op, and the current pad, PL_comppad).
When we init_cont the start and end delim_ts are used to capture state in the cont_t using several functions.
init_cont_stack allocates buffers for the mark stack and value stack. All the stacked SVs are put into an AV, increasing their reference counts, and the marks are converted to 0 based offsets, based on the value of PL_stack_sp at the start delimiter.
init_cont_cxs captures Perl's context stack. This stack contains PERL_CONTEXT structures which contain all the information about the call chain as well as loops and other blocks requiring data.
In Perl the actual storage for lexical variables is not allocated on the context stack (though it probably should be), but is instead appended to CvPADLIST(cv), which is an array of arrays stored in the CV's data directly (this allows storage space to be cleared instead of freed, so repeated calls to the same subroutine are slightly faster).
The context stack is walked from the end to the beginning, and any CXt_SUB, which denotes a subroutine call, causes the lexical pad instance of that subroutine to be popped off of CvPADLIST(cv) and stored in a pads AV in the continuation. CvDEPTH(cv) is adjusted appropriately.
The contexts also contain values like cx->blk_oldsp, previous offsets of the various stack pointers, and these, like the mark stack, are converted to be relative to the start delimiter.
init_cont_saves handles the save stack, and the associated scopestack (which is like the markstack of the savestack). The savestack is used to roll back the interpreter state when leaving a scope.
Any operation that would need to be undone is handled using the savestack, and dispatched using the ENTER and LEAVE macros.
our $foo = 1;
{ # calls ENTER
local $foo = 3;
} # calls LEAVE
ENTER pushes PL_savestack_ix onto the scope stack. This marks the offset of the savestack.
local $foo pushes a save entry onto the savestack with the previous value of $foo, which is the IV 1.
When the scope that called local is exited the LEAVE macro pops an entry from the scope stack, and then calls leave_scope (from scope.c) to dispatch all the SAVEt_* entries between PL_savestack_ix and the value of PL_savestack_ix popped off the scope stack, causing $foo to be set back to 1.
There are many types of save entries which are crucial to properly managing Perl's data structures and runtime semantics. For instance SAVEDESTRUCTOR_X(function, pointer) can be used to automatically call function on pointer when the current scope is left, and is very useful for managing data from XS modules.
To properly handle the semantics of this stack unwinding we have to partition the savestack into entries that should be recreated every time restore_cont is called (and therefore called zero or more times), and those which should be recreated when the continuation is garbage collected (and therefore only called once).
The entries that should be recreated every time are those that keep track of the currently active lexical pad (SAVEt_COMPPAD), those that keep track of stack pointers (SAVEt_STACK_POS and SAVEt_STACK_CXPOS), and those that clear lexical variables (SAVEt_CLEARSV, pushed to the savestack by pad_sv opcodes when assigning the initial value to a lexical variable).
Localization entries will be handled in the future, but are currently deferred.
Lastly, init_cont_state resets the state of the interpreter to the start delimiter and detaches the two delimiters from the linked list.
Now that we have a detached continuation in a cont_t, appending its contents to the running interpreter is what remains.
The process of invoking a continuation is pretty much the inverse of the init_cont functions. All 0 based offsets are added to the current values of the stack counters, the cont->end delim_t is used to set the values of the interpreter variables, etc.
The main things done in addition are the cloning of pads (and other state), and fixups at the top level context entry.
Pad cloning is done to ensure that each recreated instance of the continuation has its own copies of the lexical variables it uses. Without this, repeated invocations of the continuation would get in each other's way. Unfortunately this is pretty tricky: we have to walk the context stack top down and clone various values, keeping the addresses in a pointer table so that everything points back at the right thing. This is the most broken bit of code right now.
Similarly, SAVEt_COMPPAD entries have to be mapped as well, in order to reset the interpreter to the cloned pads, not the pads captured in cont_t.
Though not implemented, faked @_ arrays would need to be recreated with the duplicated value stack, and things like loop counters for foreach blocks also need more intricate copying.
Finally, the top level PERL_CONTEXT that was copied has to have its retop set to PL_op->op_next, where PL_op is the one making the call to the reified continuation. This causes the cont_reset block to return to where the caller invoked the continuation, properly connecting the call chain.
Tests are failing. Shit is segfaulting. Sanity is fading. But it works, sort of. This is a proof of concept, and now I have to finish off the various details to make it into a usable module.
If you'd like to help and have no core-fu, feel free to write failing tests. Anything you'd like to do with delimited continuations should be tests.
If you'd like to help and do have core-fu, please review my horrible C code, point out obvious mistakes, or even implement some of the missing bits.
I've been told previously that continuations aren't possible in Perl 5, but I think this has been demonstrated to be false. Anyway, $title->().
As I mentioned yesterday, I now have a new shiny analysis toy in the form of downstream dependency graphing.
So now we can tackle some issues that have been hard to get traction on until now.
The first one I'd like your help dealing with is some fallout from Kwalitee and Perl::Critic: unintended consequences of superficially positive metrics, targeted by contestants in a game without access to clear best practices, resulting in damage to the CPAN graph.
(Don't get me wrong, I'm a huge supporter of CPANTS, and Dom did a lot to help me with it, including creating the original version of the awesome red and green box author page basically to my spec. This is just about two particular metrics.)
The two big problems are caused by Test::Pod and Test::Pod::Coverage.
It's not so much the use of them at all, since they are quite safe to use when applied with author test best practices.
It's that many (alas, very many) people used these modules in ways that made their use compulsory, imposing hard dependencies instead of soft dependencies (only running them if they are already installed).
Test::Pod imposes a Perl 5.8 dependency, blowing any chance at back-compatibility for anything that uses it.
And Test::Pod::Coverage has a history of failing to get to 100% PASS rates; the current release has 30-odd failures.
So, what's the shape of the problem we've got here?
Let's start with Test::Pod::Coverage.
http://svn.ali.as/graph/Test-Pod-Coverage
(Don't forget to use CTRL-minus to zoom out) :)
As you can see, there are a couple of large nasty tangles of dependencies, and a number of high-dependency explosions.
And even though Test::Pod::Coverage has 650 dependencies (resulting in 18,500 FAILure points), it's obvious from the graph that several hundred dependencies can be removed by fixing just three modules:
Test::Compile
Module::Find
WWW::Search
There's one more critical high-dependency node there (Test::Distribution) but as a Test:: module depending on another Test:: distribution, it's far more likely it is a legitimate dependency.
But even without that last one, we've got some very juicy low-hanging fruit to go after here. And once these three have been fixed, we can rerun the graphs and see if anything else stands out.
So now we move on to Test::Pod, which is bigger again.
http://svn.ali.as/graph/Test-Pod
As with Test::Pod::Coverage, the three previous modules (Test::Compile, Module::Find and WWW::Search) show up again.
And this graph also adds some more interesting targets.
MooseX::Types::Structured
MIME::Types
MailTools
Test::Strict
So that's no more than around 7 modules that we need to tweak and rerelease to unwind half or more of the dependencies, and reduce the dependency weight of many of the Heavy 100 by 5 dependencies each.
By doing this reduction of false dependencies, we reduce the time it takes to install CPAN software, and reduce false failures due to failing dependency modules for things that never should have been dependencies at all.
One of the ways to ease the learning of Perl for non-English speakers is to have translated documentation and tutorials for them. Even if they know English, it is easier if they don't have to learn something in a foreign language. We already have some of the Perl documentation translated into a few languages, but IMHO those translations have not received enough attention.
So let me announce the first version of perldoc.org.
I'd like to thank Tiago Stock, the owner of that domain who let me use it for the advancement of Perl.
My plan is to help the people who would like to have a translated version of the Perl documentation available on the web.
Currently there are two sites, the French and the Italian versions.
The site is built using the code that John Allen wrote for the previous version of perldoc.perl.org and released more than half a year ago.
Many features of the original site are still missing, but I did not want to wait any longer with the announcement. I hope some people will be interested and start to improve the code so we can have all the features back soon. The code is now on Github. Please fork it.
The documentations themselves are maintained by other people. I am just using the packages released in the POD2::* namespace.
I’m going to be speaking at the OSS BarCamp in Dublin in September. Given the likely audience I think my Perl Myths talk would be a good fit.
It needs updating though, and that’s where you can help…
It’s a strongly fact-based talk with lots of graphs, numbers, and references. A lot has happened in the Perl community since I wrote the talk over 18 months ago!
Lots of the information needs to be checked and updated. The jobs graphs are easy, as are the basic CPAN numbers.
I’d like to get updated versions of: the graph of CPAN uploads (slide 23), test counts for perl 5.10.1 (slide 44), the graph of cpan-testers reports (slide 47).
One item that’s not in the slides, which were written in Feb 2008, but is in the screencast made later at OSCON (at 7:00), is the statement that “One quarter of all CPAN distributions have been updated in the last 4 months! Half of all in the last 17 months!” That’s an impressive and important statistic, but it needs updating. I think I got that from Schwern’s excellent “Perl is unDead” talk, but I don’t know how he got the figures or, more importantly, how to update them. (And schwern.org seems down so I can’t get the slides.)
At OSCON I also talked about myths relating to Perl 6 (again, not in the old slides, but starting at 19:20 in the screencast). I’d like to cover Perl 6 when I talk in Dublin in September, but the status of Parrot and Perl 6 has changed in the last 18 months even more than Perl 5. So I need help with good sources of hard data for Parrot and Perl 6, similar to what I have in the screencast but updated.
More generally, I’m also looking for new sources of information — hard data about the vigorous health of perl and its community. Have you seen any relevant talks/slides or blog posts you can recommend?
Can you help with any of the above?
Thank you in advance!
Posted in ireland, perl Tagged: myths, perl, perl6, presentation
The Perl 6 design team met by phone on 29 July 2009. Larry, Allison, Patrick, Nicholas, and chromatic attended.
Larry:
:trig tag
Patrick:
Larry:
~~ could be two string prefix operators, but that's useless
say ~~$something, for example
Allison:
Patrick:
Allison:
c:
Allison:
c:
Allison:
pbc_to_exe
Patrick:
Code blocks
make working from an installed Parrot
make install target in the ins2 branch
c:
Nicholas:
Patrick:
Larry:
Patrick:
Larry:
when 1, when 2, when 3... it's matching numerically
Nicholas:
Patrick:
Larry:
grep is doing
Patrick:
Larry:
grep's first argument doesn't have to be a block then
c:
grep in Perl 5 if the implementation were similar
Nicholas:
After less hacking than I'd originally thought, I've gotten a reasonable reverse of my dependency graphs working now in CPANDB.
So, I'm proud to present my favourite example.
Note that it is somewhat larger than the new kid on the block, The Universe of AnyEvent.
I'm happy to report that the protection phase of security operations on the CPAN has been completed. This phase was undertaken in response to evidence of significant password sharing amongst the publicly exposed subset of the stolen passwords.
To prevent the need for resetting passwords for all users, Andreas Koenig has instead obtained the entire set of 51,105 PerlMonks passwords, hashed against the entire set of all 4,096 possible salts.
After checking all 209,326,080 possible colliding password hashes, we have locked out 767 CPAN authors who either shared their password with PerlMonks or coincidentally shared a password with a different PerlMonks user.
An explanation of the situation and a password reset key for each of these accounts have been generated and sent to both the public cpanid@cpan.org address and the secret email address for these users. Where an address has been changed, we have also sent the password reset key to their secret email address as it existed before the original break-in.
With the exposed password database now rendered unusable in any future attacks, we can now move into an auditing phase.
This will include a number of steps including secondary checks on authors with suspicious changes in details during the potential attack window, and checks on any suspicious uploads during the attack window.
Because no suspicious uploads or reports of account hijacking have been received at this time (and because the auditing task is somewhat larger than the protection task) we will be doing this gradually over the next several weeks.
I will report in again once this task is completed.
I'd like to thank James Mastros and the other PerlMonks admins for producing and providing access to the (rather large) hash list for us.
Carl Masak looks at Perl 6 development from a unique angle.
On the way to lunch today, I queried two colleagues about Eve Online, their favourite way to spend evenings and weekends:
<masak> So, are people confused by all the detail when they first join?
<jonalv> Of course.
<masak> And how do you... handle that? How do you make it easy and fun for people to get up to speed?
<jonalv> Well, people tend to find the fun in it themselves. As to getting up to speed, they simply have to start in one end and unwind the whole thing.
<Gpox> It is complicated. There are occasional jokes about the learning curve involved.
<jonalv> Nowadays people have the wiki to help them get started as well. There wasn't one when I joined.
I think the parallels to the Perl 6 community are so clear that I don't even need to point them out.
I'm sure there are many differences between multiplayer online games, but the social dynamics seem very similar to what we have over at #perl6. Which brings me to today's thesis:
The Perl 6 community has the structure of a MMORPG.
We have just as much fun, and — what's perhaps more important — the bits we're putting together won't end up being owned by Blizzard Entertainment or CCP Games. Instead, we're building something that we ourselves and others will find directly useful during a large part of our lifetimes. I'm not out to devalue online multiplayer roleplaying games, just pointing out that there's perhaps an extra dimension to the achievement we get to feel in the Perl 6 community.
Just for the heck of it, let's take the analogy one step further.
Ahem.
So, you want to join the Perl 6 community? If so, congratulations! You're about to embark on a quest that will seriously impact your daily life, your habits and your sleep patterns. You will be building things, fighting unknown dangers, and communicating with your peers in order to make sure people are maximizing the benefit to the team and to the quest.
Here are the different classes you might want to occupy:
Warrior
One who actually fights the creatures out there, who braves new, possibly lethal dangers every day. In short: a Perl 6 application writer.
Being a warrior requires strength (you will have to hit the keyboard a lot when writing all the code, and the code will mercilessly hit you back... often), some dexterity (because working around known bugs requires adaptability) and perhaps a bit of charisma (should you want to collaborate with people, it's good to be pleasant).
We're currently on the lookout for warriors, so if you think you have it in you, be sure to stop by #perl6 and help us equip you for the perilous quest you have before you.
Priest
The priest is important for the group, but doesn't put themselves in the way of direct danger like the warrior. Instead, they perform vital tasks with their hands at a safe distance. In the Perl 6 world, priests submit bug tickets, write tests and answer newbie questions on the #perl6 channel.
Being a priest requires intelligence (because their job is mainly to increase the collective knowledge of the group), constitution (because it's a lot of work) and perhaps a bit of charisma (so we don't bite the newbies).
I must say that I admire the priests. The impressive thing isn't the incantations themselves, which are individually quite basic. The impressive thing is that their favour from the gods doesn't seem to run out. moritz++ and KyleHa++ especially come to mind. I'm sure there are slots for a few more priests too.
Mage
In the flurry of activity during a quest, the mages are the ones who wield forceful spells in the form of new features in our implementations. The mages know some pretty hefty incantations, but they speak in codes (like "Haskell", "Parrot" or "Lisp") so that us mere non-initiates can only stand by in admiration when they get going.
Being a mage requires wisdom (because, hey, you're implementing a compiler), intelligence (in figuring out how to fix the bugs and still pass the test suite), and constitution (because it's a lot of work).
Learning to be a mage looks nigh-impossible, but our mages assure us that there's actually not that much to it. And the rewards are of course enormous: you get to have an incantation of yours immortalized in millions of installations around the world.
Healer
Some people are in the group to make sure the group is doing well, and that no-one is critically low on hit-points. Discussions can sometimes get heated or sharp, by which point we're very glad to have the people around who are specially trained to see beyond the ego and help us focus on the important parts of the picture. We simply need to be reminded at times, that we're (as S01 expresses it) "a bunch of ants all cooperating (sort of) to haul food toward the nest (on average)". We don't need to agree always on everything, but keeping the group coherent is important, and healers do their magic in the background to help us with that.
Being a healer requires wisdom (to know where and when to intervene), intelligence (to know how to pick the right soothing words), and plenty of charisma (to get it across that -Ofun is what we're optimizing for).
I think many people jump in as healers now and then during our quest, but we also have one permanently employed high-level healer among us. TimToady++.
Archer
As for long-distance influence, and reaching outside of our own circles, the archer fills the important role of blogging, tweeting, iron-manning and generally making a positive noise about Perl 6, which can reach people far away.
Being an archer requires strength (again, typing is hard, and commenters/other bloggers might shoot back at you), and dexterity (you need to be fluid in expressing yourself). Having intelligence, wisdom and charisma — something I as a simple warrior often find myself wanting more of in my blogging — is not a necessity, but sure helps in getting your point across.
In conclusion: LFA, n00bs?
The Perl 6 community is on a pretty awesome quest. If you'd like to enter as one of the above classes, you're sure to find your place as a productively contributing member. Filling several roles is perfectly fine too, but people tend to find their specialties and favourites.
As to when we'll be "done", who knows? When is an MMORPG "done"? All we know is the quest we're on until April is our most ambitious yet, and that it's guaranteed to be an interesting set of months on #perl6 until then.
(Careful postscript: I know that Eve Online might not be a role-playing game in any strict sense, so if you plan to write a comment that amounts to that, consider writing "Your post made me go download Rakudo" instead. The Eve Online bit was mostly how I got to thinking along these lines in the first place.)
Carl Masak has been programming Perl since 2001. He found Perl 6 somewhere around 2004, and fell in love. Now developing November (a Perl 6 wiki), Druid (a Perl 6 board game), and proto (a Perl 6 package installer), Carl is a heavy user of and irregular committer to Rakudo.
As I already wrote about it, there was this annoying question at the Marketing BOF at YAPC::EU about how we can measure the success of the promotional or marketing project we are planning to do.
It is still unclear to me what the objectives of others are, and I am sure some people will even say that Perl does not need any promotion, but the turnout at the Marketing BOF confirmed that I am far from being alone in my thinking that Perl needs promotion.
I think I am also on the ambitious end of those who think we need more marketing. Some would say I am just a dreamer. So let me do just that here.
We need some measurable objectives and a time frame to achieve them. I mentioned some of the general objectives in my previous post; let me try to describe a few specific and mostly measurable objectives along with a target date. We can of course add more such objectives, discuss these too, and set all of them to some reasonable value.
As we cannot really be sure which of the following is achievable at all, IMHO we should say that by a certain date we would like to achieve some of these objectives.
Let's say by end of 2012 we would like to achieve at least 3 out of the following targets:
| Skill | 30 days | 90 days | 1 year |
| Java All Jobs | 5,430 | 15,502 | 84,736 |
| PHP All Jobs | 2,150 | 6,395 | 35,772 |
| Perl All Jobs | 1,342 | 3,859 | 22,229 |
| Python All Jobs | 666 | 1,841 | 9,134 |
| Ruby All Jobs | 333 | 1,023 | 4,871 |
| Language | 2008 | 2007 |
| C# | 272 | 232 |
| Java | 211 | 242 |
| PHP | 173 | 159 |
| Visual Basic | 72 | 100 |
| Ruby | 61 | 96 |
| Python | 60 | 46 |
| Perl | 29 | 38 |
| Lua | 11 | 2 |
| Rank | Site | Comment |
| 9,605 | cpan.org | 92% search.cpan.org, 5.4% cpan.org, 2% kobesearch.cpan.org |
| 18,458 | perl.org | (50% went to perldoc.perl.org 18% went to perl.org see) |
| 21,231 | perlmonks.org | |
| 38,831 | perl.com | |
| 273,104 | perlfoundation.org |
In order to compare I searched for a couple of other keywords and found these:
| Rank | Site | Comment |
| 931 | php.net | |
| 975 | java.com | |
| 6,736 | python.org | |
| 7,095 | php.de | |
| 11,329 | rubyonrails.org | |
| 19,007 | selfphp.de | |
| 30,929 | ruby-lang.org | |
| 70,586 | diveintopython.org | |
| 74,125 | php.ru |
I am sure we have to invest more energy in locating other representative web sites and in analysing this data. We should then set a target ranking for the Perl-related web sites. Getting two Perl-related web sites into the top 5,000 sites should be a good target.
We need to invest more energy in finding metrics that we think can be interesting and we should decide on some objectives that are reasonable to reach in a few years.
Clearly, with an inflammatory subject like that you're probably expecting a rhetorical question, one where I quickly answer "Yes!" or "No!" in my journal post.
I am, clearly, not pro-git. But I'm actually asking a real question :)
Git still has problems being Windows friendly (most solutions still make you act in a very unix-like manner) and so it's still not on my radar to move any time soon.
But I keep seeing one particular pro-git (and pro-distributed in general) argument come up over and over again in git discussions, the argument that git helps encourage contribution.
And THIS question is one I'm very interested in; it's something that might actually push me towards switching.
I much preferred CVS for its better branch logic, but because SVN makes it so much easier to contribute, switching to SVN resulted in improved contribution rates despite the shitty branching.
Has anyone actually measured rate of change and rate of contribution across a SVN to Git conversion?
For my repository, I have quite a good understanding of how many people are contributing.
http://www.ohloh.net/p/3299/contributors
Here's a better one showing contributors over time, compared to the (epic) contributor growth rate of Padre.
Is it possible to establish via actual metrics that the diversity of contributors and rate of change flowing into the final production CPAN releases is increased relative to the rate and diversity of change prior to the move?
Have you done it? Can I see the comparative results?
Does the theoretical improvement to contribution result in a real world improvement in contribution? Or does it suffer from the "You are not Linux" scaling problem, that our projects mostly just aren't big enough to benefit?
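For what it's worth, the raw measurement is cheap to take. Here is a rough sketch (a hypothetical helper of mine, not an existing tool) that counts distinct commit authors per month in a git checkout; run it on snapshots from before and after a conversion to compare the curves. It assumes a reasonably recent git that supports --date=format:.

#!/usr/bin/perl
# Count distinct commit authors per month in the current git checkout.
use strict;
use warnings;

open my $log, '-|', 'git', 'log', '--pretty=format:%ad %ae',
    '--date=format:%Y-%m'
    or die "Can't run git log: $!";

my %authors_by_month;
while (<$log>) {
    my ( $month, $email ) = split ' ', $_, 2;
    next unless defined $email;
    chomp $email;
    $authors_by_month{$month}{$email}++;
}
close $log;

for my $month ( sort keys %authors_by_month ) {
    printf "%s %d\n", $month, scalar keys %{ $authors_by_month{$month} };
}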
These links are collected from the Perlbuzz Twitter feed. If you have suggestions for news bits, please mail me at andy@perlbuzz.com.
Too often I see user-facing text that includes far more detail than is necessary. My current example is this web app where I can upload files for my expense reports. The help screen tells me:
You can attach the following types of files:
- Portable Document Format (.pdf)
- Joint Photographic Experts Group (.jpg or .jpeg)
- Graphic Interchange Format (.gif)
- Tagged Image Format (.tif)
- Windows OS/2 Bitmap Graphics (.bmp)
Why why why do they bother giving those unnecessary explanations of what the file extensions mean? What they really mean is:
You can attach .pdf, .jpg or .jpeg, .gif, .tif and .bmp files.
Has anyone ever said "Hey, Bob, can you send me that Joint Photographic Experts Group file?" No? Then why do you think that people want to read it?
Since the name 'modules' is taken, I was thinking about a 'packages' module to do this:
#use Spreadsheet::ParseExcel::Workbook;
#use Spreadsheet::ParseExcel::Worksheet;
#use Spreadsheet::ParseExcel::Font;
#use Spreadsheet::ParseExcel::Format;
#use Spreadsheet::ParseExcel::Cell;
#use Spreadsheet::ParseExcel::FmtDefault;
use packages 'Spreadsheet::ParseExcel' => qw(
Workbook
Worksheet
Font
Format
Cell
FmtDefault
);
No idea if people would really care, but I like the cleaner syntax. Or maybe as a new feature with 'aliased':
use aliased 'Spreadsheet::ParseExcel', import => [qw(
Workbook
Worksheet
Font
Format
Cell
FmtDefault
)];
Seems an uncomfortable overloading of aliased, but it could be handy.
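For what it's worth, the core of such a 'packages' pragma would be tiny. A minimal sketch, assuming the proposed interface above (this is a hypothetical packages.pm, not a published module):

package packages;
use strict;
use warnings;

# import() receives the base namespace plus a list of trailing
# components, and simply loads each Base::Component module in turn.
sub import {
    my ( undef, $base, @names ) = @_;
    for my $name (@names) {
        # Turn Base::Component into Base/Component.pm and load it.
        ( my $file = "${base}::${name}.pm" ) =~ s{::}{/}g;
        require $file;
    }
}

1;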
So I've been hacking on a pet project and thought that Freebase would be my answer. As far as I can tell, it's not. Not even close. Right now, Freebase is like a huge Wikipedia, but with a nice query language on top. I needed a list of all countries in the world along with basic stats like capital, population, GDP, official language, etc. Here's the script I hacked together:
use strict;
use warnings;
use WWW::Metaweb;
my $mh = WWW::Metaweb->connect(
server => 'www.freebase.com',
read_uri => '/api/service/mqlread',
trans_uri => '/api/trans',
pretty_json => 1
);
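# MQL query: the name of every topic of type /location/country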
my $countries = '[{"type":"/location/country","name":null}]';
my $result = $mh->read( $countries, 'perl' ) or die $WWW::Metaweb::errstr;
my @countries = sort map { $_->{name} } @$result;
# http://www.freebase.com/app/queryeditor
my %country_stats;
for my $country (@countries) {
my $country_info = sprintf <<'END' => $country;
[{
"type": "/location/country",
"name": "%s",
"capital":null,
"currency_used": [],
"form_of_government": [],
"gdp_nominal" : [{"timestamp":null,"currency":null,"amount":null}],
"gdp_nominal_per_capita" : [{"timestamp":null,"currency":null,"amount":null}],
"/location/statistical_region/population" : [{"number":null,"timestamp":null}],
"official_language":[{"name":null}]
}]
END
print "Reading the data for $country\n";
my $result = $mh->read( $country_info, 'perl' )
or die $WWW::Metaweb::errstr;
use Data::Dumper;
$Data::Dumper::Indent = 1;
$Data::Dumper::Sortkeys = 1;
print Dumper($result);
}
Not only are just 100 countries returned -- including the Weimar Republic and West Germany (but not East Germany) -- but most of them have almost no data associated with them. The ones which do have data often have curious results which might be correct (see the official languages), but without context, who knows? Oh, and WWW::Metaweb needs a monkey patch to get around an incompatible API change in JSON::XS. One suggestion on the Freebase message boards involved posting back the correct information. This sounds reasonable, but at the end of the day, it also sounds like a lot of work, particularly since I didn't want to base my project on Freebase. I just saw it as a useful source of information. Freebase looks awesome, but it's not quite there yet. Or I don't understand it. Who knows?
I'll have to figure out a better way of extracting this information (CIA World Factbook sounds good), but then figuring out the posting API for Freebase just sounds like more work that will distract me from my main project.
Back to the drawing board.
Marcus Ramberg responded to my post on How I Use Catalyst, and I'd like to respond to a few points he made.
Marcus wrote:
I disagree that $schema->resultset('Person') is a significant improvement on $c->model('DBIC::Person').
Me too! I don't think the former is a significant improvement over the latter. They are, after all, more or less the same. The one big problem is that the latter version uses a nonexistent DBIC::Person namespace. There are no DBIC classes anywhere in the app. I think the model version would be much better if it were just written as $c->model('Person').
Marcus also points out that the model layer lets you configure multiple models and access them in a unified way. That is indeed nice. Unfortunately, that has the problem of tying yourself to Catalyst's config, which is problematic for reasons I already described. Similarly, the unified layer only exists inside Catalyst, which is really only accessible during a web request. So now we're stuck with recreating all of this if we need to access our models outside of a web request.
The long-term Catalyst roadmap includes the much-talked-about application/context split. Once this is done, presumably you will be able to access the application, which I take to mean config and models, outside of the context (of a web request). Once that is in place, I think many of my objections will go away. Unfortunately, for now I have to write my own application/context splitting code.
I'll write more about this later on my personal blog, but I live in Dolphin Square, a place with a long, illustrious history. Now that I'm engaged, they've made it very clear that my fiancée cannot live with me (she has a cat) and I cannot live with her (I have a lease). They've patiently explained that I signed the papers and they can put my flat on the market along with several other flats they already are having trouble renting, but no, they are not willing to budge one inch. The little human touch about actually caring about people has been lost somewhere, but given my experience with Dolphin Square, this is no surprise.
Had this been simply a contractual matter, I might have just accepted that things like this happen in life and that my fiancée and I will have to wait a while before we can live together. However, Dolphin Square is a shiny façade with precious little behind it (aside from, say, tiny issues like the pile of feces they left festering in a stairwell for several days, or refusing to clean my flat before I moved in), so the gap between the quality of living they promised and the abysmal reality leaves me feeling robbed. The lease is the last straw. Hopefully I'll get a long, good rant going on my LJ and push it up near the top of Google.
There's a fair bit of discussion about version numbers and people get confused by them. Get confused no more. Here's the way you should be setting version numbers in your modules:
our $VERSION = "0.001"; # or "0.001_001" for a dev release
$VERSION = eval $VERSION;
dagolden goes in extreme detail as to why you need this and what led to this state of affairs.
A thread on Ubuntu Forums included the absurd assertion that Perl 6 has missed the boat.
This cannot be true, simply because there is no boat.
Perl 6 is not, and never has been, about capturing market share, or fitting a given niche, or becoming the pre-eminent tool of choice for a given problem area.
Larry says "Perl 6 isn't about missing boats, but about inventing airplanes."
I'm finally working on my "handy responses for Perl 6 sniping" cheat sheet. I'll post it soon!
The Perl 6 design team met by phone on 15 July 2009. Larry, Allison, Patrick, and chromatic attended.
Larry:
- use versus need
- Captures versus Parcels behave
- Pair argument becomes a mandatory, named parameter
- Pair smartmatch was too ambitious
- Pair argument
- StatBuf type

Allison:
Patrick:
Allison:
Patrick:
- -G bugs which afflicted Rakudo
- isa

c:
Patrick:
Larry:
Patrick:
Larry:
Patrick:
Larry:
Thanks to Dave Cross and Ovid for organizing the marketing BOF during YAPC, and for the notes Ovid made afterwards. The fact that there were more than 50 people in the room made it clear that people are interested in the subject.
Unfortunately that number of people made it impossible to have a discussion, but I think the presence of so many people shows we have to move on. I hope some of them will come forward and write about their opinions. I think the more people talking about this the better.
As I don't know the names of most of the people present let me just write down some of the objectives people mentioned. I think it ranged from
Make sure Perl is not banned in my organization
through
Bringing the excitement back to Perl
all the way to
Turn Perl into an accepted application development language such as Java or C#.
(this was me).
None of these are really measurable objectives so we will need to find a few of those as well but I think this is a good start. I'd be glad to see what other objectives people can write down. So go ahead, please add your comments.
Ovid kept mentioning the market research, and Richard Dice suggested a matrix idea that sounded quite good. I am now much more convinced of the need for such market research, and I hope Ovid will be able to take it upon himself to push it forward, or hand it over to someone who will do it in a reasonably short timeframe.
Some people suggested all kinds of actions that seem to be totally reasonable regardless of any market research. Some of the ideas were
I have tons of things to say about the conference, the talks and the people I met there but let me just say this:
Thank you! All of you! It was great!
There were more than 300 people in the conference and more than 100 talks.
I attended many good talks and missed many more, either because they overlapped with each other or because I was on the hallway track.
Thanks for the great talks!
In between the talks - and sometimes instead of them - on the hallway track I managed to talk to many people. I met several friends from previous conferences, but also many people who I did not know earlier or whom I only knew by e-mail address or IRC nickname.
Lots of interesting discussions, lots of good ideas; I'll write about those later.
See you next year in Pisa or at an earlier Perl workshop or conference.
My version number article exhausted me, so this will be a short post. My work to convert some more Perl module repositories from Subversion to git continues with two more things hosted on github:
MRO::Magic is a system for writing your own method dispatcher in Perl. I have written about MRO::Magic before, but it's rushing toward being useful as 5.10.1 rushes toward release.
Right now, I have two weird issues. Neither is a real blocker, but both concern me because they really highlight the hairiness of doing this sort of thing.
The first is that in one case, the following line emits an "Uninitialized value in subroutine entry" warning:
$class->UNIVERSAL::isa('Class');
In all cases in the test, $class is the string "Class" and in all but one
case, there is no warning. What? I don't know if this matters, but it's
confusing, and makes me feel like my own code is voodoo, which is not a feeling
I want.
The other case is more troublesome, as it is likely to drastically affect performance.
Because we're altering the way that method resolution works, we need to manage
our own method caching, which we can do with the "mro" module. I tried to use
mro::method_changed_in on the classes being magicked, but it failed. It
looked like I'd need to alter superclasses, so I just called the method on
UNIVERSAL and things worked. Of course, this means that any time you called
a method on an MRO::Magic class, you'd clear your entire method cache. Oof!
Today I tried injecting a new, empty package into the @ISA of magicked
classes and marking it as having changed. This got me neither the
pre-UNIVERSAL-clearing behavior (of failing tests) nor the UNIVERSAL-clearing
behavior (working). Instead, I got deep recursion.
Waah?!
If anyone is feeling brave and wants to have a look at MRO::Magic on 5.10.1 RC1, I would really appreciate it!
#!/usr/bin/env perl
use Mojolicious::Lite;
get '/' => sub {
my $self = shift;
$self->render_text('Yea baby!');
};
shagadelic;
There is also a helper script to generate a small example application.
% mojolicious generate lite_app

All the normal Mojolicious script options are available from the command line.
% ./myapp.pl daemon
Server available at http://127.0.0.1:3000.

% ./myapp.pl daemon --port 8080
Server available at http://127.0.0.1:8080.

% ./myapp.pl daemon_prefork
Server available at http://127.0.0.1:3000.

% ./myapp.pl cgi
...CGI output...

% ./myapp.pl fastcgi
...Blocking FastCGI main loop...

The shagadelic call that starts the Mojolicious script system can be customized to override normal "@ARGV" use.
shagadelic('cgi');
Routes are basically just fancy paths that can contain different kinds of
placeholders.
# /foo
get '/foo' => sub {
my $self = shift;
$self->render_text('Yea baby!');
};
All routes can have a name associated with them; this allows automatic template detection and back referencing with "url_for".
Names are always the last argument.
# /
get '/' => 'index';
# /foo
get '/foo' => 'foo';
# /bar
get '/bar' => sub {
my $self = shift;
$self->render_text('Hi!')
} => 'bar';
__DATA__
@@ index.html.epl
<a href="<%= shift->url_for('foo') %>">Foo</a>.
<a href="<%= shift->url_for('bar') %>">Bar</a>.
@@ foo.html.epl
<a href="<%= shift->url_for('index') %>">Home</a>.
Templates can have layouts.
# GET /with_layout
get '/with_layout' => sub {
my $self = shift;
$self->render('with_layout', layout => 'green');
};
__DATA__
@@ with_layout.html.epl
We've got content!
@@ layouts/green.html.epl
<html>
<head><title>Green!</title></head>
<body><%= shift->render_inner %></body>
</html>
Route placeholders allow capturing parts of a request path until a "/" or "." separator occurs; the results are stored by name in the "stash".
# /foo/*
get '/foo/:bar' => sub {
my $self = shift;
my $bar = $self->stash('bar');
$self->render_text("Our :bar placeholder matched $bar");
};
# /*something/foo
get '/(:bar)something/foo' => sub {
my $self = shift;
my $bar = $self->stash('bar');
$self->render_text("Our :bar placeholder matched $bar");
};
Relaxed placeholders allow matching of everything until a "/" occurs.
# GET /hello/*
get '/hello/(.you)' => sub {
shift->render('groovy');
};
__DATA__
@@ groovy.html.epl
Your name is <%= shift->stash('you') %>.
Wildcard placeholders allow matching absolutely everything, including
"/" and ".".
# /hello/*
get '/hello/(*you)' => sub {
shift->render('groovy');
};
__DATA__
@@ groovy.html.epl
Your name is <%= shift->stash('you') %>.
Routes can be restricted to specific request methods.
# GET /bye
get '/bye' => sub { shift->render_text('Bye!') };
# POST /bye
post '/bye' => sub { shift->render_text('Bye!') };
# GET|POST|DELETE /bye
any [qw/get post delete/] => '/bye' => sub {
shift->render_text('Bye!');
};
# /baz
any '/baz' => sub {
my $self = shift;
my $method = $self->req->method;
$self->render_text("You called /baz with $method");
};
All placeholders get compiled to a regex internally; with regex constraints this process can be easily customized.
# /*
any '/:bar' => [bar => qr/\d+/] => sub {
my $self = shift;
my $bar = $self->stash('bar');
$self->render_text("Our :bar placeholder matched $bar");
};
Routes allow default values to make placeholders optional.
# /hello/*
get '/hello/:name' => {name => 'Sebastian'} => sub {
my $self = shift;
$self->render('groovy', format => 'txt');
};
__DATA__
@@ groovy.txt.epl
% my $self = shift;
My name is <%= $self->stash('name') %>.
All those features can be easily used together.
# /everything/*?name=*
get '/everything/:stuff' => [stuff => qr/\d+/] => {stuff => 23} => sub {
shift->render('welcome');
};
__DATA__
@@ welcome.html.epl
% my $self = shift;
Stuff is <%= $self->stash('stuff') %>.
Query param name is <%= $self->req->param('name') %>.
Here's a fully functional example of an HTML form handling application using multiple features at once.
#!/usr/bin/env perl
use Mojolicious::Lite;
get '/' => 'index';
post '/form' => 'form' => sub {
my $self = shift;
my $groovy = $self->req->param('groovy') || 'Austin Powers';
$groovy =~ s/[^\w\s]+//g;
$self->render(
template => 'welcome',
layout => 'funky',
groovy => $groovy
);
};
shagadelic;
__DATA__
@@ index.html.epl
% my $self = shift;
% $self->stash(layout => 'funky');
Who is groovy?
<form action="<%= $self->url_for('form') %>" method="POST">
<input type="text" name="groovy" />
<input type="submit" value="Woosh!">
</form>
@@ welcome.html.epl
% my $self = shift;
<%= $self->stash('groovy') %> is groovy!
<%= $self->render_partial('menu') %>
@@ menu.html.epl
<a href="<%= shift->url_for('index') %>">Try again</a>
@@ layouts/funky.html.epl
% my $self = shift;
<html>
<head><title>Funky!</title></head>
<body>
<%= $self->render_inner %>
</body>
</html>
Conditions such as "agent" allow even more powerful route constructs.
# /foo
get '/foo' => (agent => qr/Firefox/) => sub {
shift->render_text('Congratulations, you are using a cool browser!');
};
# /foo
get '/foo' => (agent => qr/Internet Explorer/) => sub {
shift->render_text('Dude, you really need to upgrade to Firefox!');
};
Formats can be automatically detected by looking at file extensions.
# /detection.html
# /detection.txt
get '/detection' => sub {
my $self = shift;
$self->render('detected');
};
__DATA__
@@ detected.html.epl
<html>
<head><title>Detected!</title></head>
<body>HTML was detected.</body>
</html>
@@ detected.txt.epl
TXT was detected.
External templates will be searched by the renderer in a "templates"
directory.
# /external
any '/external' => sub {
my $self = shift;
# templates/foo/bar.html.epl
$self->render('foo/bar');
};
Static files will be automatically served from the "public" directory if it
exists.
% mkdir public
% mv something.js public/something.js

To disable debug messages later in a production setup you can change the Mojolicious mode; the default is "development".
% MOJO_MODE=production ./myapp.pl

Log messages will be automatically written to a "log/$mode.log" file if a "log" directory exists.
% mkdir log

For more control the Mojolicious instance can be accessed directly.
app->log->level('error');
app->routes->route('/foo/:bar')->via('get')->to(callback => sub {
my $self = shift;
$self->render_text('Hello Mojo!');
});
In case a lite app needs to grow, lite and real Mojolicious applications
can be easily mixed to make the transition process very smooth.
package MyApp::Foo;
use base 'Mojolicious::Controller';
sub index { shift->render_text('It works!') }
package main;
use Mojolicious::Lite;
get '/bar' => sub { shift->render_text('This too!') };
app->routes->namespace('MyApp');
app->routes->route('/foo/:action')->via('get')
->to(controller => 'foo', action => 'index');
shagadelic;

This has been a good week for me to take things that look too big and break them into smaller, attainable goals. I did it for some work projects, and I think it's helping me feel like I can start making more progress. It also led to a huge update to Path::Resolver, including much-needed documentation.
Now I'm doing the same thing with some personal projects, both code and otherwise. I'm tired of feeling boxed in by huge projects that really just need a lot of small steps taken.
My initial list of goals was:
...with some optional stuff I'd like to get done.
The first one I did already, pretty early on. I also realize that I did the final one during OSCON by producing Config::MVP, which makes the Dist::Zilla configuration generic enough for anyone to use. (Maybe I should add more documentation, though.)
So, what's left? Tests, "operating," and two specific weavers. Tests are sort of uninteresting. They'll happen. The specific weavers are also sort of obvious. Those two, especially, are very hand-wavey. Everything interesting comes down to "Pod::Weaver should work."
What a stupid goal! What was I thinking?
Tomorrow, I will start sketching out the specifics of what that means, how Pod::Weaver will be used, and how I will get it into an operational state.
YAPC::EU 2009 microreport
I'm still travelling, and so I don't have reliable sources of connectivity, time, or caffeine. As such, this report is much briefer than I would like.
I made it to YAPC::EU 2009 and survived. As my first YAPC ever, it was great to be at a conference where I could assume that everyone knew Perl. It was also great to meet a number of the people who I'd been working with on-line for years, but never met in person.
It seems that lots of people use autodie. That's good, because it's hard to gauge feelings and reactions on-line, but it's easy when a small group grabs one after a talk and asks detailed questions about its internals. There seems to be a lot of demand for autodie to provide the ability for third-party code to see if it's enabled, and also a lot of demand (not least from me) for Perl to call a special method on exception objects if they're about to kill one's process, as opposed to being caught and handled.
The conference highlight for me was the people. They were very warm, very willing to participate, and heckled only about things I could easily answer. I suspect that means they were also being very nice, since they were a very knowledgeable audience. Privately, everyone was extremely accommodating. Apparently travelling from Australia makes one special, and I felt very special indeed from all the attention.
Looks like I'm out of time for this entry already. I've got yet another flight tomorrow, so I'll try to squeeze out some writing on the plane.
Posted: 9th August 2009.
Tags: europe perl portugal travel yapceu2009
Bookmark:
While testing 5.10.1 RC1, I've hit a show-stopper for Padre. Alien::wxWidgets makes use of a get_options "debug" option that collides with the new internal "debug" option in the latest Module::Build that comes with 5.10.1 RC1.
If you must have Padre on 5.10.1 (and let's face it, you must) you can work around this problem by downgrading Module::Build to 0.32.
On Strawberry, you can just run the following on the command line
pip http://search.cpan.org/CPAN/authors/id/E/EW/EWILHELM/Module-Build-0.32.tar.gz
After seeing the goodies in Tim Bunce's OSCON talk on NYTProf 3, I couldn't resist trying it out, even though half the tests fail in the repository at the moment (especially since half his demos inevitably involve Perl::Critic or PPI and I spotted some curiosities in his video).
I'm happy to report that because of the increased detail in the area of xsubs and built-ins, I was able to fairly quickly find a situation in which excessive calls via overloading to a centralised xsub boolean true were still consuming far more time than I'd expected.
A few defined() wrappers later to speed up the worst offenders, I managed to pick up somewhere in the vicinity of 1-2% speedup on my sample loads (which are pretty similar to the loads of Perl::Critic). And since I've spent far more time than I care to admit optimising PPI already, anything I can measure as an entire percent or higher is a pretty decent win for me.
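To illustrate the kind of change involved (a contrived example of mine, not the actual PPI diff): when a value is known to be either undef or an object, a defined() test sidesteps the overloaded boolean call entirely.

use strict;
use warnings;

# A tiny stand-in for a PPI-style class whose boolean overload always
# returns true via a centralised handler.
package Token;
use overload 'bool' => sub { 1 }, fallback => 1;
sub new { bless {}, shift }

package main;
my $token = Token->new;

# Boolean context dispatches to the overloaded handler on every test:
print "have a token\n" if $token;

# A definedness check never touches the overload machinery:
print "have a token\n" if defined $token;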
PPI 1.206 with just these and no functional changes will be uploaded shortly, and I'm hoping to find some time to tackle some other promising-looking avenues relating to unexpectedly expensive regular expressions shortly.
I also ran the profiling on Perl 5.10.1 RC1 and I'm happy to report that the improvements to UNIVERSAL::isa do seem to make it less expensive (by maybe somewhere between 10-20%). This should provide for another percent or two total speed up on typical loads.
The Perl 6 design team met by phone on 08 July 2009. Larry, Allison, Patrick, Jerry, and chromatic attended.
Larry:
- use Parcels

Patrick:
Larry:
Patrick:
Larry:
- :auths and :vers
- BEGIN and START as statement prefixes
- does Hash usefully and get the same benefit in a single object
- Cursor now leaves the double-colon (::) at the end of prefix packages
- =$fh as the only thing on a line, now it complains about an illegal POD directive
- our instead of my, last week

Patrick:
c:
Allison:
Patrick:
- make install

Allison:
Patrick:
- perl6 binary

Allison:
Patrick:
- parrot_config reports compiler settings that won't work on some platforms without conditional makefile generation

Allison:
Patrick:
Allison:
Jerry:
Patrick:
c:
- does statement, is that okay?

Larry:
Patrick:
Larry:
Patrick:
c:
Patrick:
Jerry:
Patrick:
Allison:
Patrick:
Jerry:
Patrick:
Jerry:
Patrick:
Jerry:
Patrick:
I've created a photoset with 126 photos from YAPC::Europe 2009. Enjoy!
If you search for the tag "yapceu2009", you should find more photos from other conference attendees.
Read more of this story at use Perl.
I’ve had a great week at OSCON. The talks are excellent but the real value is in the relationships formed and renewed in the “hallway track”. I’m honoured and humbled to be able to call many great people my friends.
My talk on Devel::NYTProf seemed to go well. This year I covered not just NYTProf and the new features in v3 (not yet released) but also added a section on how to use NYTProf to optimize your perl code.
Here’s a quick summary, with links to the slides and screen-cast, and an outline of what’s still to be done before v3 gets released (getting closer by the day).

The first two-thirds of the talk covered general profiling topics and an overview of NYTProf v3. This was an updated and compressed version of the talk I gave last year.
The final third covered optimization techniques.
I began with a stern warning against optimization unless there’s a clear need, followed by a selection of quotes, among which is the well-known classic from Donald Knuth: “premature optimization is the root of all evil”. What few people remember is that that’s just a fragment of a more balanced statement:
“We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.”
After getting an implicit nod from Donald I quoted Rob Pike on the importance of measurement:
“Bottlenecks occur in surprising places, so don’t try to second guess and put in a speed hack until you have proven that’s where the bottleneck is.”
Then I outlined the sequence of steps I usually take when performance profiling and optimizing code. These were grouped into three phases: “Low Hanging Fruit”, “Deeper Changes”, and “Structural Changes”.
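(For anyone wanting to follow along at home, the basic Devel::NYTProf workflow underneath all three phases is just two commands.)

$ perl -d:NYTProf yourscript.pl   # run the program under the profiler
$ nytprofhtml                     # turn nytprof.out into HTML reports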
I’ve generated a PDF of the slides and made it available on slideshare.net. I’ve also uploaded the 45 minute screencast to my blip.tv channel.
Let me know if you find them useful.
In a recent post Dave Rolsky points out some of his own best practices for Catalyst. While I’ve come around to agree with his first point, that Catalyst should generate a reusable config class for the user, I’m not so sure with regards to his model viewpoints.
For me, the strength of Catalyst’s Model layer has always been the immense glue layer that allows me to configure any model in a predictable way. Much like DBI gives me a unified way to talk to databases, the Catalyst model-layer gives us a unified way of configuring models. In addition, we are able to provide helpers to create these models from the command line, reducing the work needed to set up a new model.
I will grant Dave that the API for the model adapters could be better. It is my hope that we will accomplish this as Catalyst takes advantage of more of the new stack that Moose provides. In particular I am excited about the work Devin Austin is doing for GSoC on improving the -Devel package. This is an area where we can significantly improve without too much worry about backwards compatibility. For instance, the KiokuDB model already uses Moose accessors for config.
Dave’s example only limits itself to talking to a SQL store via DBIx::Class. However, real world applications typically have several models. In some of my apps I talk to S3, or Queue servers, or LDAP stores for user management. This shows the true power of the Catalyst model layer.
I also hope that we can provide a different way to tie models to controllers. This should be part of the moosify branch of Catalyst-Runtime. I disagree that $schema->resultset('Person') is a significant improvement on $c->model('DBIC::Person'). Controller code is not meant to run without a context anyways.
I hope that by looking at Moose Extensions, we will be able to find a more suitable API for this functionality. Of course, we have a lot of work ahead of us, but I am really starting to like what Catalyst has turned into.
I love Moose. It's a postmodern object system for Perl 5. It's very powerful, saves me writing a lot of code and is very extensible. Dave Rolsky received a grant to write the Moose::Manual and it's a great introduction to Moose, how to use Moose and covers every part of it in detail. I don't really enjoy reading documentation on a screen, so I converted the manual from Pod to LaTeX so that the typography would be beautiful, fixed a few typos in the manual, designed a nice cover and you can now buy a copy for yourself. At the YAPC::Europe 2009 auction a copy of the book signed by Yuval Kogman and Larry Wall went for €120!
So YAPC is pretty much over, and it was awesome. My last minute talks went somewhere between "fine" and "well" depending on the talk. Unlike most other YAPCs I actually caught some of the other sessions too.
I'm really happy with the feedback from the attendees, people are responding better than ever to Moose and even KiokuDB despite the fact that it's such a young project.
This YAPC, more than any other, is reassuring me that the Perl community is moving in the right direction. We seem to be refocusing on what's really important (to me anyway). We are not all the way there, Perl still has problems with marketing, technical debt, flawed perceptions, bad practices and so on, but it's obvious that the trend is one of moving towards where we want to be, not some inevitable doom.
As reported elsewhere the first super easy Padre installer for Windows was released last week.
It is now called the Almost Six package as it contains a development version of Rakudo Perl 6.
Almost Six contains the following packages:
It is very easy to download the msi package and then install it as you would install any other Windows application.
Then you can start writing either Perl 5 or Perl 6 code.
I was at YAPC::EU for the last 3 days, but there almost everyone is using Linux or Mac OS X. So I wonder: what do you suggest? How can we attract Windows users who write Perl code to try Padre and give us feedback?
As you might know, Sebastian recently added single-file support to his Mojolicious framework. Viacheslav Tikhanovskii has taken this functionality and run with it, producing bootylicious – a ~280 SLOC web log system. It sorts a directory of POD formatted blog posts, and even supports RSS. I think these single-file apps are an excellent way to demonstrate your framework. I actually remember this being a selling point for Maypole back in the day :)
After the plethora of commits and rush of features into the 0.42 Birthday Edition of Padre, the brief lull afterwards has given me the chance to do some refactoring in less exciting areas.
One of these areas is in the Plugin Manager, which currently has two interpretations of a "Plugin Name" (The one that the plugin tells us to call it, and the wrong one) :)
So I spent quite a few hours last night digging all the remnants of those old names out (they were used everywhere, and deeply integrated) and just sticking to the class name as the plugin's primary internal identifier.
One benefit of this is that the Plugin system can now (theoretically) support plugins outside the Padre::Plugin::FooBar flat namespace (although they won't be auto-detected and there's no way to tell the plugin manager to add it yet).
A secondary benefit of this is that I can now add auto-detection for plugins in the long-awaited Acme::Padre::FooBar namespace.
Padre is starting to accumulate enough subsystems now that the idea of doing joke plugins is getting quite realistic.
For example, the SVN-only plugin Padre::Plugin::Media shows a proof of concept for making the editor asynchronously play sound. This Media code, combined with Hook::LexWrap and a File::ShareDir collection of wav files, would make it pretty easy to inject sound effects into arbitrary editor events.
Imagine a Simpsons or South Park themed editor! Or for something less copyright-breaky, perhaps something more like an Acme::Padre::TheCollectedSayingsOfRicardoSignes, with witty and biting interjections whenever the plugin spots you doing something rjbs doesn't like :)
"Oh god, you're using MIME-Tools. Just kill yourself now!"
"You're using SVN? Switch to Git you fool!"
"Oh good, you're not connected to the internet. We need to talk..."
The new Acme:: detection is available in SVN now, and will be in the next 0.43 release on the CPAN.
#!/usr/bin/env perl
use Mojolicious::Lite;
get '/' => 'index';
get '/:groovy' => sub {
my $self = shift;
$self->render(text => $self->stash('groovy'));
};
shagadelic;
__DATA__
@@ index.html.epl
% my $self = shift;
% $self->stash(layout => 'funky');
Yea baby!
@@ layouts/funky.html.epl
% my $self = shift;
<html>
<head><title>Funky!</title></head>
<body>
<%= $self->render_inner %>
</body>
</html>
You will have full access to the Mojolicious script system automatically, so this small app can already run on every backend Mojo supports (including CGI, FastCGI and the built-in HTTP 1.1 servers).
% ./myapp.pl daemon
Server available at http://127.0.0.1:3000.

If you've been paying attention to web framework developments recently, you will have noticed much similarity to sinatra.rb, a Ruby framework that got a lot of buzz for its simplicity. So after Viacheslav Tikhanovskii proved that it was possible to build Mojolicious::Lite on top of Mojolicious in about 40 lines of code, it was really a no-brainer to make it a part of the core. For me the best of all is that we can now provide newcomers with a subset of real Mojolicious features, so they can start out very quickly and have their apps later grow into full Mojolicious applications with minimal effort.
The Perl 6 design team met by phone on 01 July 2009. Larry, Allison, Patrick, Jerry, and chromatic attended.
Larry:
- use and require into primitives
- our scoping rather than my scoping
- :D and :U modifiers on types for defined and undefined
- use statements
- TOP instead of comp_unit
- PROCESS namespace
- SETTING namespace from your core namespace
- Test and Debug packages are now on their own
- use a setting file directly
- use does
- CORE, GLOBAL, and UNIT settings
- & variants of the types
- snaptest target to STD testing so that I can test a snapshot of the compiler against the entire t/ directory

Allison:
Patrick:
- BEGIN blocks
- -G I commit

Jerry:
c:
Patrick:
- &Type in the symbol table, what will that do?

Larry:
- &Int, it just finds the type Int

Patrick:
- &Int?

Larry:
- Int

Patrick:
Larry:
c:
Larry:
Patrick:
Larry:
Patrick:
- my $a = (1, 2, 3);

Larry:
- minmax at this point

Patrick:
Larry:
Patrick:
Jerry:
Patrick:
- .minmax method available on ranges

Larry:
Patrick:
Larry:
Patrick:
- reduce form that worked well

Larry:
Patrick:
Larry:
- ? quantifier gives me an array, but I just want to know if it's there!
- "+ mean?" is very appropriate

Patrick:
In a perfect world, I would never blog about version numbers in Perl.
Version numbers should be boring. No one should have to think about how they specify or check a version number. Perl programming makes things easy, not hard, right?
Unfortunately, version numbers in Perl aren’t boring and easy. Instead, they are complicated and confusing. Every Perl programmer needs to understand at least some of this complexity. Otherwise, you can make life difficult for yourself or others without realizing it.
In this article, I’m going to explain what I think are ‘good practices’ for dealing with version numbers in Perl. I’m going to point out what I think are ‘bad practices’.1 In the end, I’m going to make some recommendations that I hope people will follow for their own sake and the sake of others who work with their code.
For the impatient, the disinterested or those who just want to follow a recipe, my advice for all modules is this:
our $VERSION = "0.001"; # or "0.001_001" for a dev release $VERSION = eval $VERSION;
If you already use decimal versions that aren’t in the 3-digit decimal format, that’s fine, too. It’s not worth switching for code you’ve already published.
If you disagree with me or would like to see alternatives or want to know the ugly details behind this recommendation, then the rest of this article is for you.
It will be helpful to review at least a little history, then we can ignore how we got here and just focus on what to do. There are three big historical accidents worth noting:2
Accident #1: Perl expects a module to put its version number in a package variable called $VERSION
The first accident means that a module version number can be anything that can be represented by a Perl scalar: a number, a string, an object, etc. There are no constraints, only conventions. This is a blessing and a curse; it’s the main reason why version numbers in Perl aren’t boring.
Accident #2: The Perl 5.6 interpreter changed from a ‘decimal’ version number (5.005_03) to a ‘dotted-integer’ version number (5.6.0)
The second accident got people thinking that ‘version numbers’ for modules shouldn’t be decimals either. It also introduced a three-digit convention for translating between the two approaches: 5.6.2 is equivalent to 5.006002 and 5.005_03 is equivalent to 5.5.30. Finally, it introduced new syntax for ‘v-string’ literals of the form ‘v1.2.3.4′ to help represent dotted-integer version numbers.
A third historical accident was actually an attempt to fix the second:
Accident #3: The Perl 5.10 interpreter made $^V a version object instead of a v-string
Version objects are an attempt to overcome numerous limitations and challenges using v-strings for module version numbers. The UNIVERSAL::VERSION() method also changed, to better accomodate version objects and v-strings.
Here is a very contrived example of the change to VERSION(), where 120, 100, 103 are just the ASCII codes of the letters of my IRC nickname:
$ perl5.8.8 -e '$VERSION=v120.100.103; print main->VERSION'
xdg
$ perl5.10.0 -e '$VERSION=v120.100.103; print main->VERSION'
v120.100.103
The version.pm module provides an interface to version objects and, for older versions of Perl, overrides UNIVERSAL::VERSION() globally to match the behavior of Perl 5.10. The version.pm module is one more thing that keeps version numbers from being as boring as I’d like.
$ perl5.8.8 -Mversion -e '$VERSION=v120.100.103; print main->VERSION'
v120.100.103
Think about the implications of that for a module called ‘Foo’. Even if Foo doesn’t use version.pm, if version.pm is loaded anywhere then Foo->VERSION acts differently and gives a different value than is in $Foo::VERSION.
For the most part, version numbers are used to answer just two questions:
Neither question is one that we, as people, particularly want to waste our time on if a computer can answer the questions for us. That means that version numbers need to be specified in a way that makes it easy for a computer to answer these questions.
Recommendation: Never use alphabetical characters in a version
Don’t use ‘1.20alpha’ or ‘2.34beta2′ or ‘2.00R3′. There are no standard conventions for alphanumerics, and you just make life hard for the machines, which means no help for the humans.
After following the first recommendation and excluding alphabetical characters for module versions, we are left with two numeric conventions, just like Perl itself:
A decimal version is just what it seems: an ordinary number that is either an integer (12) or decimal fraction (12.345). Decimal versions are compared in the ordinary, numerical way.
Dotted-integer versions are vectors of (positive) integers, so called because they are usually shown separated by full-stop characters (12.3.45). Dotted-integer versions are compared by pairwise numerical comparison of the first element of each version, then the second, the third, and so on.
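The two comparison rules can disagree about ordering for the same digits; here is a small illustration (mine, using version.pm's overloaded comparison operators):

use version;

# Pairwise dotted-integer comparison: v1.10.0 sorts after v1.9.0.
print version->new("v1.10.0") > version->new("v1.9.0") ? "yes\n" : "no\n"; # yes

# Plain numeric comparison: 1.10 is just 1.1, which sorts before 1.9.
print 1.10 > 1.9 ? "yes\n" : "no\n"; # no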
There are two ways to represent dotted-integer versions in Perl:
I already described v-strings, but you can read more about them in the perldata documentation page. Despite what it says, v-strings are not deprecated. (This warning will be removed in a future version of Perl.)
Version objects are created using the version.pm module. There are several ways to do this. Here is one example (and it must be kept on one line):
use version; our $VERSION = version->new("v1.2.3");
The version.pm documentation has been updated in 0.77 to better explain the options and potential pitfalls creating version objects, so I won’t repeat them here.
John Peacock, the author of version.pm, recently suggested that it is probably enough to use version.pm and give a quoted $VERSION string without the need to explicitly construct an object, as UNIVERSAL::VERSION will convert it to a version object internally anyway.3
use version; our $VERSION = "v1.2.3";
This approach is new and untested, but may eventually be regarded as the best way to use version.pm.
Perl decided on a three-digit convention to convert between the older decimal style and the newer dotted-integer style so that older Perls could give a useful error message when checking the Perl version number.
# on Perls before 5.6.0
use 5.6.2;    # syntax error
use 5.006002; # Perl version error
The three-digit convention takes each dotted-integer component after the first, pads them with leading zeroes if less than 3 digits, and concatenates them together. To go the other way, the fraction portion of a decimal version is padded with trailing zeroes until the number of characters is a multiple of three, then each group of three digits is turned into an integer.
This convention has a subtle complexity: note that the conversion does not round-trip if a dotted-integer element is greater than 999. This might happen if a version number were constructed from a VCS revision number or a timestamp.
v1.2.34567 -> 1.00234567
1.00234567 -> v1.2.345.670
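The convention is mechanical enough to sketch in a few lines (my illustration of the rule above, not code from any of the toolchain modules):

# Convert a decimal version to the equivalent dotted-integer form.
sub decimal_to_dotted {
    my ($decimal) = @_;
    my ( $integer, $fraction ) = split /\./, $decimal;
    $fraction = '' unless defined $fraction;

    # Pad with trailing zeroes to a multiple of three digits ...
    $fraction .= '0' while length($fraction) % 3;

    # ... then turn each group of three digits into an integer.
    return 'v' . join '.', $integer, map { 0 + $_ } $fraction =~ /(\d{3})/g;
}

print decimal_to_dotted('1.00234567'), "\n"; # v1.2.345.670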
The introduction of the three-digit convention confused people in two ways: how to convert modules from decimal versions to dotted-integers and how to specify version number limits to use().
Many CPAN modules don’t use three-digit decimals, partly because most module boilerplate tools create modules with a two-digit decimal starting version:
$VERSION = "0.01";
These modules usually increment their versions as 0.02, 0.03 and so on. Some authors have been surprised trying to convert their decimal versions into dotted-integers and running afoul of the three-digit convention. In the following example, a naive conversion results in a number that is less than the previous one:
0.01
0.02
0.03
v0.4.0  # WRONG: this is 0.004
v0.40.0 # RIGHT: this is 0.040
The other area of confusion is providing a version number requirement to the use() keyword. With the introduction of v-strings, but prior to Perl 5.10.0, use() internally converts "use Foo v1.2.3" to "use Foo 1.002003". So this works as expected:
# in Foo.pm
our $VERSION = "0.001002";

# in foo.pl
use Foo v0.1.2; # WORKS
However, since Perl recommends specifying "use v5.6.0" as "use 5.006", some people think the same should apply to loading modules with use(). But, prior to Perl 5.10.0 (and the change to UNIVERSAL::VERSION), the inverse case might not work at all! Consider this example:
# in Foo.pm
our $VERSION = v0.1.2;

# in foo.pl
use Foo 0.001002;
On a Perl compiled with support for long doubles, the extra precision in converting with the three-digit convention causes the comparison to fail with this incredibly confusing error message:
$ perl-5.8.9-64bit -e 'use Foo 0.001002'
Foo version 0.001002 required--this is only version 0.001002
Recommendation: always use() in the same form as $VERSION in a module
When $VERSION is a v-string, it should only be requested as a v-string ("use Foo v0.1.2"), except on Perl 5.10 or when using version.pm. There is a corollary:
Recommendation: don’t switch version number schemes for a published module
If you do switch, then users won’t know in advance the right format to request.
Distributions on CPAN also have version numbers. These are specified as part of the filename. (See a prior article for a formal definition of modules and distributions.)
DAGOLDEN/File-Marker-0.13.tar.gz
Recommendation: Set the distribution version from a module
Usually, distribution versions are set automatically from a primary module within the distribution. This is good — we let the machine take care of it for us. It’s one less thing for us to do, and helps to ensure machine-friendly distribution versions. That’s good for binary packages like .rpm and .deb.
For example, using Module::Build, the ‘module_name’ parameter in Build.PL specifies the name of a module to examine for the distribution name and version.
use Module::Build;
Module::Build->new(
    module_name => 'Foo::Bar',
    license     => 'perl',
)->create_build_script;
Or, using Module::Install, you can do the same thing with a Makefile.PL like this:
use inc::Module::Install;
name     'Foo-Bar';
all_from 'lib/Foo/Bar.pm';
WriteAll;
It has been a long-standing CPAN convention that distribution version numbers containing an underscore are ‘development’ or ‘alpha’ versions, and the corresponding distribution files do not get indexed by PAUSE as a ‘release’ version.
DAGOLDEN/Test-Reporter-1.53_03.tar.gz
The three main build tools, ExtUtils::MakeMaker, Module::Build and
Module::Install will all attempt to parse a version number for a
distribution from a primary module file using the MM->parse_version() function provided by ExtUtils::MakeMaker. The parse_version() method looks for the first line in the file that appears to set $VERSION, and then calls eval() on that entire line.
The following sections show how to specify an alpha version in each of the three version number styles and some things to consider for each.
our $VERSION = "0.001_001; $VERSION = eval $VERSION;
For a decimal alpha version, the definition is split into two parts. The
first part provides the version in quotes, which is what gets returned by
MM->parse_version(). It has to be in quotes so that the underscore is preserved in the eval() call within parse_version(). Without quotes, parse_version() returns it as an ordinary decimal.
our $VERSION = 0.001_001; # WRONG: parse_version() gives 0.001001
The second line is required to make $VERSION an ordinary number at runtime. Without it, Perl would convert $VERSION to a number by truncating at the underscore, resulting in the wrong version number.4
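The truncation is easy to demonstrate at the command line:

$ perl -e 'print "0.001_001" + 0'   # 0.001    (numification stops at the underscore)
$ perl -e 'print eval "0.001_001"'  # 0.001001 (eval sees a numeric literal)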
our $VERSION = v0.1_1;
For some versions of Perl, it’s possible to specify an ‘alpha v-string’, with the final decimal point replaced with an underscore.5 Consider a ‘Foo’ module with a $VERSION line like the one above. Here’s how different versions of Perl handle a request for a higher version number:
$ perl5.10.0 -e 'use Foo v0.1.2'
Foo version v0.1.2 required--this is only version v0.1_1 at -e line 1.
BEGIN failed--compilation aborted at -e line 1.
$ perl5.8.9 -e 'use Foo v0.1.2'
$ perl5.8.0 -e 'use Foo v0.1.2'
Foo v0.1.2 required--this is only v0.1.1 at -e line 1.
BEGIN failed--compilation aborted at -e line 1.
$ perl5.6.2 -e 'use Foo v0.1.2'
Note how Perl 5.8.9 and Perl 5.6.2 both succeed, even though a higher version is requested. One potential solution is to require the version.pm module. This ‘fixes’ 5.8.9, but fails in a different way for 5.6.2.
perl5.8.9 -Mversion -e 'use Foo v0.1.2'
Foo version v0.1.2 required--this is only version v0.1_1 at -e line 1.
BEGIN failed--compilation aborted at -e line 1.
perl5.6.2 -Mversion -e 'use Foo v0.1.2'
Foo version v0.1.2 required--this is only version v0.0.0 at -e line 1.
BEGIN failed--compilation aborted at -e line 1.
use version; our $VERSION = version->new("v0.1_1");
This form is visually similar to an alpha v-string, but by putting it in quotes and by passing it the version object constructor, the version is protected from idiosyncracies of different versions of the Perl interpreter.
On the surface, it seems like this is a good approach, but it still has problems if one tries to use() the same version on 5.8.0 and 5.6.2.
$ perl5.8.0 -e 'use Foo v0.1_1'
Foo version v0.1.1 required--this is only version v0.1_1 at -e line 1.
BEGIN failed--compilation aborted at -e line 1.
$ perl5.6.2 -e 'use Foo v0.1_1'
Foo version 0.011 required--this is only version v0.1_1 at -e line 1.
BEGIN failed--compilation aborted at -e line 1.
The counter-examples in the previous sections assume a situation where an alpha version of a module is a prerequisite and show how an alpha v-string argument to use() means different things depending on the version of Perl and whether version.pm has been loaded.
The other alternative is to specify the version as a decimal.
use Foo 0.001_001;
While this is in the form of a decimal alpha version, the Perl parser sees this numeric literal as 0.001001. This works perfectly with the decimal alpha version example above that has the “$VERSION = eval $VERSION” line. Both are just numbers and they compare as equal.
That isn’t the case for version objects. An alpha version object is not equal to a non-alpha version object, even if their numeric components are the same. If Foo has this version line:
use version; our $VERSION = version->new("v0.1_1");
then "use Foo 0.001_001" fails, even on a recent Perl.
$ perl5.10.0 -e 'use Foo 0.001_001'
Foo version 0.001001 required--this is only version v0.1_1
Recommendation: Don’t use v-strings or version objects as alpha versions
There are just too many ways for alpha v-strings and version objects to be used incorrectly. Even when requiring Perl 5.10 or version.pm, which a module author can control, a user who follows the three-digit convention and passes a decimal version to use() can still make it fail.
There are several other ways in which version numbers refuse to be boring. Each could be an entire mini-essay, so I will only highlight a few of the issues I’ve found:
I think version numbers should be boring. I don’t want to have to think about how to write them and I don’t want to make users think about how to format a version prerequisite. And I don’t want to get bug reports about it when a user gets it wrong.
For me, a ‘boring’ version number must:
Given these criteria, my recommendation is to use decimal version numbers, put them in quotes, and add a string eval:
our $VERSION = "0.001"; $VERSION = eval $VERSION;
This is safe and effective and always works. By putting any $VERSION in quotes, even if it isn’t an alpha, you don’t have to remember to add them if you ever change to an alpha version. (And numbers with trailing zeroes are nicely formatted when parsed for distribution versions.)
If you really want to have a dotted-integer module version, then I strongly recommend that you limit your module to Perl 5.10 (or require version.pm and at least Perl 5.8.1) and that you never use an alpha version number. Always quote your dotted integer version when you define it and always use a leading-v to guide your users towards proper usage.
It’s unfortunate that version numbers are so complicated in Perl, but if you follow the recommendations in this article, your version numbers will be as boring as possible. And if you’ve read this all the way to the end, I hope I’ve convinced you that ‘boring’ version numbers are exactly what you want.
I would like to thank Curtis Poe, Ricardo Signes and Eric Wilhelm for reviewing a draft of this article and providing excellent suggestions for improving clarity and correctness. I would also like to thank the participants in the #toolchain and #corehackers channels on IRC for being a sounding board as the article was developed. Thanks also go to John Peacock for his graceful acceptance of my criticisms and suggestions about version.pm.
I wrote a response, explaining where Dijkstra was coming from, and I am very happy with how it came out, so I'm reposting it here.
The list subscriber said, in part:
On a side note, I never read anything by Dijkstra that wasn't noticeably out of touch with the reality of programming, which qualifies them as screeds to me. And I say that as a former Pascal programmer, and as one who has read, and bought into, things like Kernighan's "Why Pascal is Not My Favorite Programming Language" and the valid rants about how some form of breaking out of a loop without having to proceed to the end is very useful, without destroying structure (except by Dijkstra's definition of structure)...

A lot of people bring up the premature-loop-exit prohibition without understanding why Dijkstra suggested it; it wasn't just that he was a tightassed Dutchman.
Dijkstra's idea was this: suppose you want to prove, mathematically, that your program does what it is supposed to do. Please, everyone, suspend your judgment of this issue for a few paragraphs, and bear with me. Let's really suppose that we want to do this.
Dijkstra's idea is that the program is essentially a concatenation of blocks, each of which is trying to accomplish something or other, and each of which does not make sense to run unless some part of the program state is set up for it ahead of time. For example, the program might be to print a sorted list of links from a web page. Then the obvious blocks are:

A. fetch the web page
B. extract the links from the page into an array
C. sort the array of links
D. print out the links in sorted order
We say that the "precondition" for C is that the array be populated with URLs, and the "postcondition" is that the array be in sorted order. What you would want to prove about C is that if the precondition holds—that is, if the array is properly populated before C begins—then the postcondition will hold too—that is, the array will be in sorted order when C completes.
It occurs to me that calling this a "proof" is probably biasing everyone's thinking. Let's forget about mathematical proofs and just think about ordinary programmers trying to understand if the program is correct. If the intern in the next cubicle handed you his code for this program, and you were looking it over, you would probably think in very much this way: you would identify block C (maybe it's a subroutine, or maybe not) and then you would try to understand if C, given an array of URLs, would produce a properly sorted array by the time it was done.
C itself might depend on some sub-blocks or subroutines that performed sub-parts of the task; you could try to understand them similarly.
Having proved (or convinced yourself) that C will produce the postcondition "array contains sorted list of URLs", you are in an excellent position to prove (or convince yourself) that block D prints out a sorted array of URLs, which is what you want. Without that belief about C, you are building on sand; you have almost nothing to go on, and you can conclude hardly anything useful about the behavior of D.
Now consider a more complex block, one of the form:
if (q) { E; }
else { F; }
Suppose you believe that code E, given precondition x, is guaranteed to produce postcondition y. And suppose you believe the same thing about F. Then you can conclude the same thing about the entire if-else block: if x was true before it began executing, then y will be true when it is done.[2] So you can build up proofs (or beliefs) about small bits of code into proofs (or beliefs) about larger ones.

We can understand while loops similarly. Suppose we know that condition p is true prior to the commencement of some loop, and that if p is true before G executes, then p will also be true when G finishes. Then what can we say about this loop?
while (q) { G; }
We can conclude that if p was true before the loop began, then p will still be true, and q will be false, when the loop ends.

BUT BUT BUT BUT if your language has break, then that guarantee goes out the window and you can conclude nothing. Or at the very least your conclusions will become much more difficult. You can no longer treat G atomically; you have to understand its contents in detail.
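A concrete illustration of the loop rule (my own example, not Dijkstra's): below, the invariant p is "$total is the sum of the elements removed from @queue" and the loop condition q is "@queue is non-empty". The body preserves p, so when the loop exits we know both p and not-q, and together they tell us that $total counts every element. Uncomment the last and p still holds at the exit, but not-q does not, so the conclusion is lost:

use strict;
use warnings;

my @queue = (3, 1, 4, 1, 5);
my $total = 0;

while (@queue) {            # while (q) { G }
    my $x = shift @queue;
    $total += $x;           # G preserves p
    # last if $x == 4;      # a break: p survives, "not q" does not
}

# p and not-q together: $total is the sum of ALL the elements.
print "$total\n";           # 14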
So this is where Dijkstra is coming from: features like break[3] tend to sabotage the benefits of structured programming, and prevent the programmer from understanding the program as a composition of independent units. The other subscriber made a seemingly disparaging reference to "Dijkstra's idea of structure", but I hope it is clear that it was not an arbitrary idea. Dijkstra's idea of structure is what will allow you to understand a large program as a collection of modules.
Regardless of your opinion about formal verification methods, or correctness proofs, or the practicality of omitting break from your language, it should at least be clear that Dijkstra was not being doctrinaire just for the sake of doctrine.
An archive of Dijkstra's miscellaneous notes and essays (a pre-blogging blog, if you like) is maintained at the University of Texas. I recommend it.
Actually, your job is slightly easier. Let's write

[x] E [y]

to mean that code E, given precondition x, produces postcondition y. That is, if we know that x is true when E begins execution, then we know that y is true when E finishes. Then my quoted paragraph above says that from these:

[x] E [y]
[x] F [y]

we can conclude this:

[x] if (q) {E} else {F} [y]

But actually we can make a somewhat stronger statement. We can make the same conclusion from weaker assumptions. If we believe these:

[x and q] E [y]
[x and not q] F [y]

then we can conclude this:

[x] if (q) {E} else {F} [y]

In fact this precisely expresses the complete semantics of the if-else construction. Why do we use if-else blocks anyway? This is the reason: we want to be able to write code to guarantee something like this:

[x] BLAH [y]

but we only know how to guarantee

[x and q] FOO [y]

and

[x and not q] BAR [y]

for some q. So we write two blocks of code, each of which accomplishes y under some circumstances, and use if-else to make sure that the right one is selected under the right circumstances.
Hoare logic was invented by Tony Hoare, who also invented the Quicksort algorithm.
Order A Discipline of Programming
I have an idea that the Dijkstra book might be easier to follow after having read this introduction than without it.
Order Software Tools in Pascal
Last week we took Iris to the pediatrician for a checkup, during which they weighed, measured, and inoculated her. The measuring device, which I later learned is called a stadiometer, had a bracket on a slider that went up and down on a post. Iris stood against the post and the nurse adjusted the bracket to exactly the top of her head. Then she read off Iris's height from an attached display.

How did the bracket know exactly what height to report? This was done in a way I hadn't seen before. It had a photosensor looking at the post, which was printed with this pattern:
The pattern is binary numerals. Each numeral is a certain fraction of a centimeter high, say 1/4 centimeter. If the sensor reads the number 433, that means that the bracket is 433/4 = 108.25 cm off the ground, and so that Iris is 107.75 cm tall.
The patterned strip in the left margin of this article is a straightforward translation of binary numerals to black and white boxes, with black representing 1 and white representing 0:
0000000000
0000000001
0000000010
0000000011
0000000100
0000000101
0000000110
...
1111101000
1111101001
...
1111111111

If you are paying attention, you will notice that although the strip at left is similar to the pattern in the doctor's office, it is not the same. That is because the numbers on the post are Gray-coded.
Gray codes solve the following problem with raw binary numbers. Suppose Iris is close to 104 = 416/4 cm tall, so that the photosensor is in the following region of the post:
...
0110100001 (417)
0110100000 (416)
0110011111 (415)
0110011110 (414)
...

But suppose that the sensor (or the post) is slightly mis-aligned, so that instead of properly reading the (416) row, it reads the first half of the (416) row and the last half of the (415) row. That makes 0110111111, which is 447 = 111.75 cm, an error of almost 7.5%. (That's three inches, for my American and Burmese readers.) Or the error could go the other way: if the sensor reads the first half of the (415) row and the second half of the (416) row, it will see 0110000000 = 384 = 96 cm.
Gray code is a method for encoding numbers in binary so that each numeral differs from the adjacent ones in only one position:
0000000000
0000000001
0000000011
0000000010
0000000110
0000000111
0000000101
0000000100
0000001100
...
1000011100
1000011101
...
1000000000

This is the pattern from the post, which you can also see at the right of this article.
Now suppose that the mis-aligned sensor reads part of the (416) line and part of the (417) line. With ordinary binary coding, this could result in an error of up to 7.75 cm. (And worse errors for children of other heights.) But with Gray coding no error results from the misreading:
...
0101110001 (417)
0101110000 (416)
0101010000 (415)
0101010001 (414)
...

No matter what parts of 0101110000 and 0101110001 are stitched together, the result is always either 416 or 417.
Converting from standard binary to Gray code is easy: take the binary expansion, and invert every bit that is immediately to the right of a 1 bit. For example, in 1111101000, every bit that sits just to the right of a 1 is inverted, giving the Gray code 1000011100.
Converting back is also easy: replace every sequence of the form 1000...01 in the Gray code with 1111...10; also replace 1000... with 1111... if it appears at the end of the code. For example, the Gray code 1000011100 contains two such sequences, 100001 and 11, which are replaced with 111110 and 10, to give 1111101000.
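Both conversions are two-liners in code. A quick Perl sketch (the function names are mine), using the bitwise form of the same two rules:

use strict;
use warnings;

# Binary -> Gray: inverting every bit to the right of a 1 bit is the
# same as XORing the number with itself shifted right by one.
sub bin2gray { my $n = shift; return $n ^ ($n >> 1) }

# Gray -> binary: undo that XOR by folding in every right-shift.
sub gray2bin {
    my $g = shift;
    my $n = 0;
    for (my $mask = $g; $mask; $mask >>= 1) { $n ^= $mask }
    return $n;
}

printf "%010b -> %010b\n", 0b1111101000, bin2gray(0b1111101000);
# prints 1111101000 -> 1000011100, matching the example above

gray2bin(bin2gray($_)) == $_ or die "round trip failed at $_"
    for 0 .. 1023;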
This is obviously nonsense, because suppose the post office employs half a million letter carriers. (The actual number is about half that, but we are doing a back-of-the-envelope estimate of plausibility.) Then the bite rate is six bites per thousand letter carriers per year, and if children are 900 times more likely to be bitten, they are getting bitten at a rate of 5,400 bites per thousand children per year, or 5.4 bites per child. Insert your own joke here, or use the prefabricated joke framework in the title of this article.
I wrote to the reporter, who attributed the claim to the Postal Bulletin 22258 of 7 May 2009. It does indeed appear there. I am trying to track down the ultimate source, but I suspect I will not get any farther. I have discovered that the "900 times" figure appears in the Post Office's annual announcements of Dog Bite Prevention Month as far back as 2004, but not as far back as 2002.
Meantime, what are the correct numbers?
The Centers for Disease Control and Prevention have a superb on-line database of injury data. It immediately delivers the correct numbers for dog bite rate among children:
| Age | Number of injuries | Population | Rate per 100,000 |
|---|---|---|---|
| 0 | 2,302 | 4,257,020 | 54.08 |
| 1 | 7,100 | 4,182,171 | 169.77 |
| 2 | 10,049 | 4,110,458 | 244.47 |
| 3 | 10,355 | 4,111,354 | 251.86 |
| 4 | 9,920 | 4,063,122 | 244.15 |
| 5 | 7,915 | 4,031,709 | 196.32 |
| 6 | 8,829 | 4,089,126 | 215.91 |
| 7 | 6,404 | 3,935,663 | 162.72 |
| 8 | 8,464 | 3,891,755 | 217.48 |
| 9 | 8,090 | 3,901,375 | 207.36 |
| 10 | 7,388 | 3,927,298 | 188.11 |
| 11 | 6,501 | 4,010,171 | 162.11 |
| 12 | 7,640 | 4,074,587 | 187.49 |
| 13 | 5,876 | 4,108,962 | 142.99 |
| 14 | 4,720 | 4,193,291 | 112.56 |
| 15 | 5,477 | 4,264,883 | 128.42 |
| 16 | 4,379 | 4,334,265 | 101.03 |
| 17 | 4,459 | 4,414,523 | 101.01 |
| Total | 133,560 | 82,361,752 | 162.16 |
According to the USPS 2008 Annual Report, in 2008 the USPS employed 211,661 city delivery carriers and 68,900 full-time rural delivery carriers, a total of 280,561. Since these 280,561 carriers received 3,000 dog bites, the rate per 100,000 carriers per year is 1069.29 bites.
So the correct statistic is not that children are 900 times more likely than carriers to be bitten, but rather that carriers are 6.6 times as likely as children to be bitten, 5.6 times if you consider only children under 13. Incidentally, your toddler's chance of being bitten in the course of a year is only about a quarter of a percent, ceteris paribus.
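The arithmetic is easy to check; the whole calculation fits in a few lines of Perl:

use strict;
use warnings;

my $carriers     = 211_661 + 68_900;               # city + rural, USPS 2008
my $carrier_rate = 3_000 / $carriers * 100_000;
my $child_rate   = 133_560 / 82_361_752 * 100_000; # CDC, ages 0-17

printf "carriers: %.2f per 100,000\n", $carrier_rate;   # 1069.29
printf "children: %.2f per 100,000\n", $child_rate;     # 162.16
printf "ratio:    %.1f\n", $carrier_rate / $child_rate; # 6.6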
Where did 900 come from? I have no idea.
There are 293 times as many children as there are letter carriers, and they received a total of 44.5 times as many bites. The "900" figure is all over the Internet, despite being utterly wrong. Even with extensive searching, I was not able to find this factoid in the brochures or reports of any other reputable organization, including the American Veterinary Medical Association, the American Academy of Pediatrics, the Centers for Disease Control and Prevention, or the Humane Society of the United States. It appears to be the invention of the USPS.
Also in the same newspaper, the new Indian restaurant on Baltimore avenue was advertising that they "specialize in vegetarian and non-vegetarian food". It's just a cornucopia of stupidity today, isn't it?
There is a standard example in linguistics that is attached to the word "flimp". The idea it labels is that certain grammatical operations are restricted in the way they behave, and cannot reach deeply into grammatical structures and rearrange them.
For instance, you can ask "What did you use to see the girl on the hill in the blue dress?" and I can reply "I used a telescope to see the girl on the hill in the blue dress". Here "the girl on the hill in the blue dress" is operating as a single component, which could, in principle, be arbitrarily long. ("The girl on the hill that was fought over in the war between the two countries that have been at war since the time your mother saw that monkey climb the steeple of the church...") This component can be extracted whole from one sentence and made the object of a new sentence, or the subject of some other sentence.
But certain other structures are not transportable. For example, in "Bill left all his money to Fred and someone", one can reach down as far as "Fred and someone" and ask "What did Bill leave to Fred and someone?" but one cannot reach all the way down to "someone" and ask "Who did Bill leave all his money to Fred and"?
Under certain linguistic theories of syntax, analogous constraints rule out the existence of certain words. "Flimped" is the hypothetical nonexistent word which, under these theories, cannot exist. To flimp is to kiss a girl who is allergic to. For example, to flimp coconuts is to kiss a girl who is allergic to coconuts. (The grammatical failure in the last sentence but one illustrates the syntactic problem that supposedly rules out the word "flimped".)
I am not making this up; for more details (from someone who, unlike me, may know what he is talking about) see Word meaning and Montague grammar by David Dowty, p. 236. Dowty cites the earlier sources, from 1969–1973, who proposed this theory in the first place. The "flimped" example above is exactly the same as Dowty's, and I believe it is the standard one.
Dowty provides a similar, but different example: there is not, and under this theory there cannot be, a verb "to thork" which means "to lend your uncle and", so that "John thorked Harry ten dollars" would mean "John lent his uncle and Harry ten dollars".
I had these examples knocking around in my head for many years. I used to work for the University of Pennsylvania Computer and Information Sciences department, and from my frequent contacts with various cognitive-science types I acquired a lot of odds and ends of linguistic and computational folklore. Michael Niv told me this one sometime around 1992.
The "flimp" thing rattled around my head, surfacing every few months or so, until last week, when I thought of a counterexample: Wank.
The verb "to wank to" means "to rub one's genitals while considering", and so seems to provide a countexample to the theory that says that verbs of this type are illegal in English.
When I went to investigate, I found that the theory had pretty much been refuted anyway. The Dowty book (published 1979) produced another example: "to cuckold" is "to have sexual intercourse with the woman who is married to".
Some Reddit person recently complained that one of my blog posts had no point. Eat this, Reddit person.
However, if same-sex marriages are permitted, there may not be a stable matching, so the character of the problem changes significantly.
A minimal counterexample is:
| | First choice | Second choice | Third choice |
|---|---|---|---|
| A prefers: | B | C | X |
| B prefers: | C | A | X |
| C prefers: | A | B | X |
| X prefers: | A | B | C |
Suppose we match A–B, C–X. Then since B prefers C to A, and C prefers B to X, B and C divorce their mates and marry each other, yielding B–C, A–X.
But now C can improve her situation further by divorcing B in favor of A, who is only too glad to dump the miserable X. The marriages are now A–C, B–X.
B now realizes that his first divorce was a bad idea, since he thought he was trading up from A to C, but has gotten stuck with X instead. So he reconciles with A, who regards the fickle B as superior to her current mate C. The marriages are now A–B, C–X, and we are back where we started, having gone through every possible matching.
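You can verify the cycle mechanically. Here is a small Perl sketch (mine; the preferences are the ones from the table above) that tries each of the three possible matchings and looks for a "blocking pair", two people who each prefer the other to their assigned partner:

use strict;
use warnings;

my %pref = (
    A => [qw(B C X)],
    B => [qw(C A X)],
    C => [qw(A B X)],
    X => [qw(A B C)],
);

# rank: a lower number means more preferred
my %rank;
for my $p (keys %pref) {
    my $i = 0;
    $rank{$p}{$_} = $i++ for @{ $pref{$p} };
}

# the three ways to pair off four people
my @matchings = (
    { A => 'B', B => 'A', C => 'X', X => 'C' },
    { A => 'C', C => 'A', B => 'X', X => 'B' },
    { A => 'X', X => 'A', B => 'C', C => 'B' },
);

for my $m (@matchings) {
    for my $p (qw(A B C X)) {
        for my $q (qw(A B C X)) {
            next unless $p lt $q;       # consider each pair once
            next if $m->{$p} eq $q;     # already married to each other
            if (   $rank{$p}{$q} < $rank{$p}{ $m->{$p} }
                && $rank{$q}{$p} < $rank{$q}{ $m->{$q} } )
            {
                print "in matching A-$m->{A}: $p and $q elope\n";
            }
        }
    }
}

Each of the three matchings prints an eloping pair, so none of them is stable.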
This should not be taken as an argument against same-sex marriage. The model fails to generate the following obvious real-world solution: A, B, and C should all move in together and live in joyous tripartite depravity, and X should jump off a bridge.
Two people so far have written to warn me that I would regret this once the space aliens come, and I have to go around undoing all my changes. But even completely leaving aside Wikipedia's "Wikipedia is not a crystal ball" policy, which completely absolves me from having to worry about this eventuality, I think these people have not analyzed the situation correctly. Here is how it seems to me.
Consider these example sentences:
There are four possible outcomes for the future:
In cases (1) and (3), both sentences require revision.
In case (4), neither sentence requires revision.
But in case (2), sentence (a) requires revision, while (b) does not. So my change is a potential improvement in a way I had not appreciated.
Also in last week's article, I said it would be nice to find a case where a Wikipedia article's use of "known to man" actually intended a contrast with divine or feminine knowledge, rather than being a piece of inept blather. I did eventually find such a case: the article on runic alphabet says, in part:
In the Poetic Edda poem Rígþula another origin is related of how the runic alphabet became known to man. The poem relates how Ríg, identified as Heimdall in the introduction, ...
class Persons2 {
Person personA, personB;
Persons2(Person a, Person b) {
personA = a; personB = b;
}
Person getPersonA() { return personA; }
...
}
Java is loathsome in its verbosity, and this sort of monkey code is Java's verbosity at its most loathsome. So I did not do this.

Haskell functions return only one value also, but this is no limitation, because Haskell has product types. And starting in Java 5, the Java type system is a sort of dented, bolted-on version of the type systems that eventually evolved into the Haskell type system. But product types are pretty simple. I can make a generic product type in Java:
class Pair<A,B> {
A a; B b;
Pair(A a, B b) { this.a = a; this.b = b; }
A fst() { return a; }
B snd() { return b; }
}
Then I can declare my function to return a
Pair<Person,Person>:
Pair<Person,Person> findMatch() {
...
return new Pair<Person,Person>(husband, wife);
}
Okay, that worked just fine. The boilerplate is still there, but you only have to do it once. This trick seems sufficiently useful that I can imagine that I will use it again, and that someone else reading this will want to use it too.

I've been saying for a while that up through version 1.4, Java was a throwback to the languages of the 1970s, but that with the introduction of generics in Java 5, it took a giant step forward into the 1980s. I think this is a point of evidence in favor of that claim.
I wonder why this class isn't in the standard library. I was not the first person to think of doing this; web search turns up several others, who also wonder why this class isn't in the standard library.
I wrote a long, irrelevant coda regarding my use of the identifiers husband and wife in the example, but, contrary to my usual practice, I will publish it another day.
[ Addendum 20090517: Here's the long, irrelevant coda. ]
In the past I have gone on search-and-destroy missions against certain specific phrases, for example "It should be noted that...", which can nearly always be replaced with "" with no loss of meaning. But "known to man" is more fun.
One pleasant property of this phrase is that one can sidestep the issue of whether "man" is gender-neutral. People on both sides of this argument can still agree that "known to man" is best replaced with "known".
As a pleonasm and a cliché, "known to man" is a signpost to prose that has been written by someone who was not thinking about what they were saying, and so one often finds it amid other prose that is pleonastic and clichéd. For example:
Diamond ... is one of the hardest naturally occurring material known (another harder substance known today is the man-made substance aggregated diamond nanorods which is still not the hardest substance known to man).

Which I trimmed to say:

Diamond ... is one of the hardest naturally-occurring materials known. (Some artificial substances, such as aggregated diamond nanorods, are harder.)

Many people ridicule Strunk and White's fatuous advice to "omit needless words"—if you knew which words were needless, you wouldn't need the advice—but all editors know that beginning writers will use ten words where five will do. The passage above is a good example.
Can "known to man" always be improved by replacement with "known"? I might have said so yesterday, but I mentioned the issue to Yaakov Sloman, who pointed out that the original use was meant to suggest a contrast not with female knowledge but with divine knowledge, an important point that completely escaped my atheist self. In light of this observation, it was easy to come up with a counterexample: "His acts descended to a depth of evil previously unknown to man" partakes of the theological connotations very nicely, I think, and so loses some of its force if it is truncated to "... previously unknown". I suppose that many similar examples appear in the work of H. P. Lovecraft.
It would be nice if some of the Wikipedia examples were of this type, but so far I haven't found any. The only cases so far that I haven't changed are all direct quotations, including several from the introductory narration of The Twilight Zone, which asserts that "There is a fifth dimension beyond that which is known to man...". I like when things turn out better than I expected, but this wasn't one of those times. Instead, there was one example that was even worse than I expected. Bad writing it may be, but the wrongness of "known to man" is at least arguable in most cases. (An argument I don't want to make today, although if I did, I might suggest that "titanium dioxide is the best whitening agent known to man" be rewritten as "titanium dioxide is the best whitening agent known to persons of both sexes with at least nine and a half inches of savage, throbbing cockmeat.") But one of the examples I corrected was risibly inept, in an unusual way:
Wonder Woman's Amazon training also gave her limited telepathy, profound scientific knowledge, and the ability to speak every language known to man.

I have difficulty imagining that the training imparted to Diana, crown princess of the exclusively female population of Paradise Island, would be limited to languages known to man.
Earle Martin drew my attention to the Wikipedia article on "The hardest metal known to man". I did not dare to change this.
[ Addendum 20090515: There is a followup article. ]
Line 319 in XML document from class path resource [applicationContext-standalone.xml] is invalid; nested exception is org.xml.sax.SAXParseException: cvc-complex-type.2.3: Element 'beans' cannot have character [children], because the type's content type is element-only.

Experienced technicians will of course want to look at line 319. Silly! If looking at line 319 were any help, this would not be this month's lucky winner. Line 319 is the last line of the document, and says, in whole, "</beans>".
What this actually means is that there is a stray plus sign at the end of line 54.
Well, that is the ultimate cause. The Fregean Bedeutung, as it were.
What it really means (the Sinn) is that the <beans>...</beans> element is allowed to contain sub-elements, but not naked text ("content type is element-only") and the stray plus sign is naked text.
The mixture of weird jargon ("cvc-complex-type.2.3") and obscure anaphora ("character [children]" for "plus sign") got this message nominated for the competition. The totally wrong line number is a bonus. But what won this message the prize is that even if you somehow understand what it means, it doesn't help you find the actual problem! You get to grovel over the 319-line XML file line-by-line, looking for the extra character.
Come on, folks, it's a SAX parser, so how hard is it to complain about the plus sign as soon as it shows up?
What do we have for the lucky winner, Johnny?
You'll be flown to lovely Centralia, Pennsylvania, where you'll enjoy four days and three nights of solitude in an abandoned coal mine being flogged with holly branches and CAT-5 ethernet cable by the cast of "The Hills"!

Thank you, Johnny. And there is a runner-up! The badblocks utility that is distributed as part of the Linux e2fsprogs package produces the following extremely useful error message:
% badblocks /home
badblocks: invalid starting block (0): must be less than 0

Apparently this is Linux-speak for "This program needs the name of a device file, and the programmer was too lazy to have it detect that you supplied the name of the mount point instead".
Happy spring, everyone!
If you've been at YAPC::Europe 2009 (or still are there...), I've probably shown you my still rather simple space invaders clone, which is totally boring besides the fact that it's written in Perl (and based on SDL). It's now on github: http://github.com/domm/Game-PerlInvaders. Please clone and patch!
BTW, if you want to try it out and have problems installing SDL from CPAN, try the most recent version from github: http://wiki.github.com/cthulhuology/SDLPerl
We had a BOF ("Birds Of a Feather") for marketing Perl yesterday. Dave Cross organized it and I kicked things off. My agenda didn't quite go as planned. I started by laying out the goals:
The BOF didn't follow that. I didn't think we'd get all of these concerns addressed, but when we had 51 people showing up and pulling things in different directions, it turned into herding cats. I tried unsuccessfully to pull things back on track, but everyone was insisting on explaining the problems they had with Perl and the solutions they saw for those problems. After cutting a few people short to ensure that others got a chance to talk (frustrating, because some people who went on at length had excellent lists of issues that they wanted to address), I went with the flow and steered things from there. Here's what came out of it.
We couldn't agree on a clear goal, so I just pulled rank and said "our goal is 'bringing the excitement back to Perl'" and we moved on from there. Then one person (I didn't catch his name, sorry) asked the annoyingly obvious question: how do we measure success? We have several ideas here, but we need more: job postings, analyzing server logs from perldoc.perl.org (a lovely site, by the way) and other Perl sites, tracking articles, and similar things. We don't yet know which measures will be most effective, but we know we need them.
Richard Dice suggested that when companies are worked with (he had a lot to say about the selection of companies), we need a neutral way of getting information. Imagine a matrix with the vertical axis listing value drivers ("availability of programmers", "speed of development", "commercial support" and so on) and the horizontal axis listing various technologies, and have companies "score" their perceptions of these technologies. I've no idea if this is the way we'll go (yes, we know we need to talk to more than just companies), but it's much better than what we previously considered.
There was also general -- not universal -- agreement that we need to have market research and analysis professionals involved. How we pay for this is unknown at this time, but Gabor and others had plenty of ideas on how to make this happen.
Edmund von der Burg, the brains behind send-a-newbie.com, surveyed the newbies for their attitudes. It's not a comprehensive or formal survey, but that information was illuminating nonetheless.
Some of the various issues which were raised (no endorsement, just mentioning issues):
I learned quite a bit that I didn't previously know, but perhaps the most fascinating thing was this: in many countries, Perl does not have a negative reputation because no one's heard of Perl. If that's not an opportunity, I don't know what is. It might be beyond our scope (I really don't know, and it would be hard to quantify), but I highly recommend people think about how we could change that.
Obviously there's a lot more than that and we have more work to do, but this is a good start. Some potentials can't be discussed yet (there are some companies which might be willing to provide advice/money/assistance) and others aren't finalized, but we're starting to get things on track.
Today, I converted the ExtUtils::ParseXS repository from subversion to git, published it on github.com and announced it on the perl5-porters and module-build perl mailing lists. Within hours, it was cloned and I got my first pull request.
I don’t know if that’s the start of a trend or not, but it’s food for thought for anyone who questions whether git makes contributing easier. As a publisher, git means no need to centrally administer repository access. As a contributor, it means no need to request repository access (and wait for it to be granted). And for both, it means no need to manage patches outside the VCS. I’m convinced that making it easier for people to contribute means greater odds of getting contributions.
On the technical side, this was not the easiest conversion, and I learned a lot about git history rewriting during the cleanup. Some of the highlights:
This latter one was a multi-part fix. Some commit entries said “*** empty log message ***”. Others were literally empty. Worse, the very first commit was the “*** empty …” one. git rebase is the normal tool for fixing up history, but rebase can’t change the root commit. I worked around it by following instructions I found for rebasing the root commit.
The interactive rebase gave me a list of commits and message synopses. I changed all the “*** empty …” ones from “pick” to “edit”. The entirely empty ones showed up strangely in the rebase list; here ce86d39 is the empty commit and parent of f4a7235, with the log message shown:
pick ce86d39 >f4a7235 blah blah blah
It turns out the way to rebase this is to just break them apart:
edit ce86d39
pick f4a7235 blah blah blah
Then, during the rebase run, when it would stop on “*** empty …” commits, I would amend the commit message with a list of files changed (a lazy fix, I’ll admit) and continue:
$ git diff --name-only HEAD^ | \
perl -E 'say "Updated " . join(q{, }, map { chomp; $_} <>)' | \
git commit --amend --file=- && git rebase --continue
Voila — very lazy empty log editing. In the blank message cases, the rebase commit itself fails due to the empty message, but I could do a direct “git commit” (not amend, because the rebase cherry-pick commit failed), give an appropriate log message, and then “git rebase --continue”.
After that was done, the history was great, but the tags were still on the old messy branch. But that was easy to fix with a little Perl programming:
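A sketch of that little program (my reconstruction, not the original script; it assumes each tagged commit's subject line survived the rewrite unchanged and is unique on the rewritten master):

use strict;
use warnings;

# For each existing tag, find the rewritten commit whose subject line
# matches the tagged commit's subject, and force the tag over to it.
# (Naive shell quoting; fine for simple log messages.)
chomp( my @tags = qx(git tag) );
for my $tag (@tags) {
    chomp( my $subject = qx(git log -1 --format=format:%s "$tag") );
    chomp( my ($new) = qx(git rev-list -F --grep="$subject" master) );
    if ($new) {
        system( 'git', 'tag', '-f', $tag, $new ) == 0
            or warn "could not move tag $tag\n";
    }
    else {
        warn "no rewritten commit found for tag $tag\n";
    }
}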
This would have been more complex if there had been merges to consider and I’m glad there weren’t.? But the end result was a nice clean, well-tagged git history.
"During YAPC::Europe 2009 in Lisbon, Patrick Michaud, the Rakudo Compiler Pumpking, has just announced that Rakudo Star will be released in Spring 2010."
I'm not sure if Patrick actually used those terms, but I'd just like to remind folks that using seasons to indicate international release schedules is one of the most pointless and annoying things you can do.
It's much better to at least pick a month or a quarter. Summer means December/January; Autumn is meaningless, since there's not really any noticeable climatic difference and the trees don't know it's Autumn (they stay green all year round); Winter usually means mid-to-late May, when the first snow hits the mountains and the temperature drops 10 degrees overnight; and Spring usually means October (some would say September, but until the football finals are over it doesn't really feel like it).
Of course, I'm sure the Rakudo Star release will slip a bit from October. But this does leave us releasing around about Christmas just as we originally said.