yapc: Yet Another Perl Conference

North America 2004

Talk Descriptions

A framework for TSH network trace analysis
Kostas Pentikousis
Trace analysis is the behavioral study of network traffic to reveal patterns, quantify usage, measure feature deployment, and identify protocol shortcomings. An excellent repository of traces collected at several monitoring points is maintained by the National Laboratory for Applied Network Research (NLANR/PMA). Although the repository features a wide variety of traces, it does not provide software to analyze the binary traces. We fill this void in a platform-independent way with Net::Traces::TSH, a pure Perl module, which provides methods to analyze IP packet traces in Time Sequenced Headers (TSH) format.

Net::Traces::TSH, available from CPAN, can examine TSH traces, collect Meta trace information, measure protocol usage, and quantify IP Fragmentation, Differentiated Services and ECN deployment. Packet and segment size distributions are also generated and stored in comma-separated values (CSV), a platform-independent text format. Furthermore, one can extract all TCP sender transmissions and essential information about each data-carrying TCP segment. The module can convert binary TSH traces to (a) tcpdump text format for visual inspection or presentation in a classroom; and (b) binary Traffic Trace format for input to the network simulator (ns-2).

Our research goal is to develop a "TCP-centered" error model (TEM) based on the measured experience of real world TCP flows, and demonstrate that TEMs provide a more realistic simulation tool than rate-based models for TCP performance evaluation.

Taking advantage of Perl's features, we enjoyed rapid development without having to make costly sacrifices in performance. We were able to focus on the research problem and explore many ideas instead of spending time on implementation. In fact, based on the same code, we expanded our research to quantify the deployment of TCP options using traces from the NLANR/PMA repository. We believe that Net::Traces::TSH can become the standard framework for TSH trace analysis.

Beyond advanced regexes
Abigail
We will show how we can (ab)use regexes to solve problems that are considered "hard". We will solve problems like the 8-queens problem, finding Hamiltonian paths, lot drawings, and more. We will also show that matching Perl regular expressions (even without using any of the "experimental" features) is NP-complete by solving the 3CNF-SAT problem using Perl regular expressions.
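
As a rough illustration of the 3CNF-SAT idea, here is a minimal sketch of one well-known encoding that relies only on backreferences. It is written in Python's re module rather than Perl, and it is not necessarily the construction presented in the talk; the function name and the clause representation (tuples of signed integers) are assumptions made for this example.

```python
import re


def sat_by_regex(num_vars, clauses):
    """Try to satisfy a CNF formula with a single backtracking regex match.

    clauses is a list of tuples of non-zero ints: 3 means variable 3,
    -3 means its negation. Returns {variable: bool} or None.
    """
    # Subject string: one 'x' per variable, a ';', then 'x,' per clause.
    subject = "x" * num_vars + ";" + "x," * len(clauses)

    # Group i captures 'x' (variable i is true) or '' (variable i is false).
    # The trailing x* absorbs any 'x' characters the groups did not take.
    pattern = "^" + "(x?)" * num_vars + "x*;"
    for clause in clauses:
        alternatives = []
        for literal in clause:
            if literal > 0:
                # Consumes the clause's 'x' only if the group captured 'x'.
                alternatives.append("\\" + str(literal))
            else:
                # Consumes the clause's 'x' only if the group captured ''.
                alternatives.append("\\" + str(-literal) + "x")
        pattern += "(?:" + "|".join(alternatives) + "),"
    pattern += "$"

    match = re.match(pattern, subject)
    if match is None:
        return None
    return {i: match.group(i) == "x" for i in range(1, num_vars + 1)}
```

The regex engine's backtracking over the optional groups is what enumerates truth assignments, so the running time can grow exponentially with the number of variables; the sketch is only meant for small formulas.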

Building Applications with OpenInteract2
Chris Winters
OpenInteract2 is a full-featured application server that makes it easy to distribute database- and platform-independent applications. This talk will walk through building an application from the beginning to some form of being useful -- applications are never truly done. We will follow the OpenInteract2 package development tutorial fairly closely [1], with illuminating flowcharts in addition to plenty of opportunities for explanation, experimentation and questions. This talk is interactive; we'll be creating an application as we go rather than simply presenting it statically. With additional time we can add advanced functionality to the application -- new tables and objects, join queries, lookup table editing, and application-level security. We can also delve deeper into the design philosophies behind the system.

[1] OpenInteract2::Manual::Tutorial http://search.cpan.org/~cwinters/OpenInteract-1.99_03/lib/OpenInteract2/Manual/Tutorial.pod

Building Scalable Websites with Perl
Perrin Harkins
Did you know that many of the largest websites in the world are built with Perl? Want to know how they do it? This talk will discuss techniques like queuing, caching, and clustering that allow sites like Ticketmaster.com to handle massive amounts of traffic. We'll examine how they work, and how you can put them to work on your own sites.

CPAN modules every perl programmer should know
Mark Fowler
The CPAN is the greatest resource available to any Perl programmer; the only problem is there's so much of it. Where does a Perl programmer start? What are the essential modules an author needs to know about to get by? And more importantly, which modules do they need to know about to get their work done in half the time?

For the past four years Mark Fowler has spent every Christmas selecting twenty-five of his favourite Perl Modules and writing about them in the Perl Advent Calendar (http://perladvent.org/). In this tutorial he explains a selection of modules designed to make your life easier. These modules include ways to:

Extend the language's syntax
Extract and mark up random data
Effectively debug and profile your applications
Create web applications easily and securely
Deal smartly with storing things in databases
Sort and filter email
Process XML quickly and simply
Embed other languages in Perl
Easily install your modules
And more...

Cluster::Run Perl Module
Zachary Zebrowski
This talk will introduce the Cluster::Run perl module, which (will be) available on CPAN. The module was designed as yet another way to perform distributed processing and stress testing across different operating systems. The module allows you to automatically distribute the application (which need not be perl) and data files to various systems, run the application on the various systems, and correlate any output files and statistics automatically. The only requirements are that perl, a few modules, and ssh be installed and set up correctly.

Digital Forensics - Using Perl to Harvest Hash Sets
Douglas White
The NIST National Software Reference Library (www.nsrl.nist.gov) is designed to collect software from various sources and incorporate file profiles computed from this software into a Reference Data Set (RDS) of information. The RDS can be used by law enforcement, government, and industry organizations to review files on a computer by matching file profiles in the RDS. This will help to efficiently determine which files are important as evidence on computers or file systems that have been compromised or seized as part of criminal investigations.

This presentation will touch briefly on the concepts of file hashes, and focus on the use of Perl to automate the collection of hashes from various media. The process of applying one algorithm - hashing - to a library of files could be generic enough to be used by other algorithms - code unravelling, decryption, signal processing, etc. Law enforcement, digital archival, corporate security and forensic investigation applications will be discussed.
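
To make the idea of applying one algorithm to a library of files concrete, here is a minimal sketch. It is written in Python rather than Perl, and the function names, the default SHA-1 algorithm, and the directory-walking strategy are all assumptions for illustration; they do not describe the NSRL's actual tooling. The `profile` parameter is the hook where a different per-file algorithm could be swapped in.

```python
import hashlib
from pathlib import Path


def hash_file(path, algorithm="sha1", chunk_size=65536):
    """Return the hex digest of one file, reading it in fixed-size chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def harvest(root, profile=hash_file):
    """Apply `profile` to every regular file under `root`.

    Returns a dict mapping each file's path (relative to root) to its
    profile. Passing another callable replaces hashing with a different
    per-file algorithm without changing the traversal.
    """
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): profile(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }
```

For example, `harvest("/mnt/cdrom")` would return one digest per file on a mounted disc (the mount point is hypothetical), and `harvest(root, profile=lambda p: p.stat().st_size)` would collect file sizes instead.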

Does Pair Programming Benefit Perl Programmers?
Tom Legrady
Extreme Programming has had some influence on many programmers, not just those who adopted the paradigm whole-heartedly. Most significantly, unit testing is becoming increasingly widespread. But then, testing has been encouraged for decades; the difference now is that XP evangelists have provided frameworks which make it simple to automate testing. Similarly, refactoring has gained some support. Encapsulating in-line code into a routine or method is not new, but having IDE support for refactoring encourages the process. On the other hand, developing no more than is necessary, refactoring no more than is required, IS somewhat revolutionary. Detractors and doubters fear that such an evolutionary design process invites dead-ends and other hardships.

A more radical aspect of XP is the suggestion that programmers should work in pairs, that four eyes and one keyboard are faster than four eyes and two keyboards. Most newcomers are dubious. Some developers have suggested the experiment to their managers, but managers envisage themselves hauled up on some superior's carpet to explain why developer salaries have increased while productivity has decreased, or why the current crisis project is behind schedule.

The only solution is to speak from experience, to demonstrate that programming in pairs is effective, efficient, and brings benefits to the team. To test pair programming, Toronto Perl Mongers devoted one meeting to an exercise in working in randomly paired teams. Of course, two hours is only half an afternoon, hardly sufficient to generate a significant amount of code; randomly paired teams just beginning to explore pair programming will be struggling with the learning curve more than demonstrating the benefits. With the distractions of testing and documentation, figuring out how to do pair programming, and finding commonality between partners with differing skills and experience, will any running code be produced at all?

Enterprise Perl
James A. Duncan
The difference between Perl and Enterprise Perl is not just a name, which is a pity, because, like the next guy, I hate the word Enterprise. Enterprise Perl is about a slightly higher bar in what is required in the area of maintenance, scalability, and reliability. It's about Call Centre operations, vast customer databases, and product management systems. It's slightly different in approach, because the importance of this sort of thing for organisations goes beyond the need to keep one box running. It's about keeping the business in business. This talk exposes these differences, why they mean you have to approach your job differently, and how to structure your Perl code to make it more capable for the job.

Extproc_perl 2.0: Oracle Stored Procedures in Perl
Jeff Horwitz
extproc_perl enables you to write Oracle stored procedures in Perl, providing another database programming language beyond PL/SQL and Java. Some of its supported features include cross-procedure interpreter persistence, callbacks into the database using DBI, automatic conversion between Oracle and Perl datatypes, and of course, the entire CPAN library of Perl modules. This session covers the new 2.0 version of extproc_perl from soup to nuts: a general overview, installation, configuration, usage, callbacks using DBI, troubleshooting, and plenty of useful examples that you can try on your own databases. Attendees will leave this session ready to run their first Perl stored procedure in a matter of minutes. You can read more about extproc_perl at http://www.smashing.org/extproc_perl.

Indexing and Retrieving Data with Perl and SWISH-E
Josh Rabinowitz
SWISH-E is a powerful, fast, and flexible system for building and querying indices. Perl's text handling prowess coupled with SWISH-E's special features and perl API provide a substantial and robust system upon which to build searching systems. This presentation builds on Josh's July 2003 Linux Journal article "How To Index Anything." His talk has also been mentioned on the swish-e website.

In this talk we'll cover:

SWISH-E core concepts: MetaNames, Properties, and Indices
How to index HTML files on your filesystem and test indices from the command line
How to convert files to be indexed and how SWISH-E does its indexing
Creating indices using perl and SWISH-E
Using SWISH-E's perl API to search on SWISH-E indices
Auto Properties, and more about MetaNames and Properties
Sman, a real-world open source project that uses SWISH-E and perl to provide an enhanced version of 'man -k' and 'apropos' (see http://joshr.com/src/sman/; it will likely appear on CPAN before long)
'swished', a mod_perl-based concurrent and persistent swish-e server written by Josh and to be open sourced around the time of the conference. Such a server has long been on the SWISH-E "to do" list.
Current challenges when using SWISH-E with human languages and perl.
Future plans for SWISH-E development, what you can contribute, and why

We'll also compare MySQL4's Full Text Search with SWISH-E indices in terms of features and performance, with graphs of the relative response time for the two indexing mechanisms when tested under various size indices, queries, and numbers of concurrent searches.

Learning Perl Objects, References, and Modules (Intermediate Perl)
Tad McClellan
A one-day version of Stonehenge's course companion for the Alpaca book.

Lessons Learned from dozens of interviews with Perl developers
Pierre Denis
Perl developers have distinctive characteristics that make them stand out in the crowd for good or bad reasons. This talk will take the viewpoint of the recruiter and assess the plus and minus points of Perl developers in IT compared to other groups. This talk is likely to be controversial.

Lightning Talks
Geoff Avery
5 minute talks on all things Perl. Go to the lightning talk sign-up page to submit a talk.

List-Compare
Jim Keenan
Do you find that you often have to compare two or more lists to determine whether a particular element is found in those lists? Do you have occasion to determine relationships among lists such as their intersection, union, difference, symmetric difference or subsets? If so, the Perl module List::Compare, available from CPAN, may be useful to you. List::Compare offers both functional and object-oriented interfaces to well-tested code as well as an accelerated approach to comparing just two lists at a time. This presentation will introduce you to the various functions List::Compare makes available to you.
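
For readers unfamiliar with the list relationships named above, the following minimal sketch shows what each one means for two lists. It is written in Python with built-in sets, and the function name and result keys are invented for this illustration; it does not reproduce List::Compare's interface.

```python
def compare_lists(left, right):
    """Compute the basic set relationships between two lists.

    Duplicates are ignored and results are returned in sorted order.
    """
    l, r = set(left), set(right)
    return {
        "intersection": sorted(l & r),          # in both lists
        "union": sorted(l | r),                 # in either list
        "left_only": sorted(l - r),             # difference: left minus right
        "right_only": sorted(r - l),            # difference: right minus left
        "symmetric_difference": sorted(l ^ r),  # in exactly one list
        "left_is_subset": l <= r,               # every left item is in right
    }
```

Called with `["abel", "baker", "camera"]` and `["baker", "camera", "delta"]`, the intersection is `["baker", "camera"]` and the symmetric difference is `["abel", "delta"]`.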

Mac OS X, Unix, and Perl
Steve Hayman
Every Macintosh ships with Mac OS X and Perl 5.8, but so what? What's this UNIX-based operating system all about, and how does Perl fit in? Apple Consulting Engineer Steve Hayman will review the state of Perl on Mac OS X, and show how it is exactly-almost-kind of-like Perl on other platforms, and demonstrate how scripting on the Mac can be completely "unlike" other platforms.

Magicpoint-EZ: A Pre-processor for Magicpoint Presentations
Tim Maher
Magicpoint (http://www.mew.org/mgp) is a popular open source presentation package with a strange mix of wonderfully advanced and unbearably primitive features. As examples of the latter, there's no practical way to ask for a particular word to be italicized, or for the contents of a specified file to be inserted at an arbitrary position, or even for page numbers to be shown atop each page. Even worse, there's no support for tables or user-defined styles.

Don't get me wrong; I'm very fond of Magicpoint. But if there was ever an application in desperate need of a preprocessor to amplify its ergonomics, this is the "mother of all such applications".

Magicpoint-EZ to the rescue! It converts text files written in an enhanced Magicpoint dialect into standard Magicpoint format, allowing Magicpoint to be used to view sophisticated presentations -- in spite of its primitive support for their construction. As an added bonus, Magicpoint-EZ also makes the development of presentations more efficient, because the developer typically has to type only 40% of the characters that ultimately appear in the resulting Magicpoint file.

Magicpoint-EZ is over one year old, and has been used to create hundreds of impressive slides that have been shown at YAPC, TPC, and Perl mongers meetings -- where attendees have found it hard to believe they were viewing a Magicpoint presentation! This talk will demonstrate the features of Magicpoint-EZ, and describe the pure-Perl code that implements it. Along the way, attendees will learn generally applicable Perl techniques for automating the conversion of one body of text into another. Naturally, the presentation will be developed using Magicpoint-EZ itself, and presented using the standard Magicpoint program.

Mail-Digest-Tools
Jim Keenan
Do you like to follow mailing list discussions -- about Perl or any other topic -- but find that you can't deal with 40 or 50 separate e-mails per day? Do you, as an alternative, subscribe to a 'daily digest' version of a mailing list? Would you like to be able to follow individual discussion 'threads' both within and across mailing list digests? If so, then the Perl module Mail::Digest::Tools, now available on CPAN, provides useful tools for you. This presentation will show you how to manage a local archive of mailing list discussion thread files. Mail::Digest::Tools -- a tool for the truly obsessive!

MediaToys
Lawrence Furnival
The basic technology is a REST type "service" written in perl using a Mac::Glue interface to extract information about media on the server and to perform some limited operations through quicktime. We do this so students can develop their own MediaToys (mostly using PHP) and have access to some underlying services. For my MediaToys, file sequencing is done with smil for quicktime (generated by perl). All the infrastructure for the project is written in perl, i.e. uploads, file management, quota management, mapping of IDs, logins etc. are all done in perl.

ToyRadio.pl generates a smil file that looks like a streaming radio talk show. Students give out our phone number and collect voice mails that are automatically routed to their account. They can then author (cgi-perl -> smil) a series of voice mails together with a jingle and sound effects into a pseudo "call in radio show" using really simple tools (telephone, email, cgi).

Mouth-to-Mouth works quite similarly and uses the same technology, but has the trappings of a debate with a split screen and voice and images. Students get 3 min time slots (main points) where they can string together voicemails from themselves (and their friends). Then they get 1 min slots (rebuttal) and then 30 seconds (summary). Both sides can see the current state of the other's talk. (And either side can hit a nuke button and delete everything at any point.) When one side has the mic, the other changes the image on their side of the screen to comment on the style and lack of substance in the other side's arguments (graphic of person sleeping with zzzz's, arrow to horse's ass and other high minded images). Final publication has vote buttons at the bottom for the audience to vote on who wins.

We are also looking at getting visual content from faxes (to our faxserver) and we are running SquidCam (which has video voicemail in quicktime format) on the quicktime server and perhaps by the time of the conference I will have some other new mediaToys using those resources. The basic idea is to use really simple and primitive input formats (telephone, fax) to create simple media chunks that can then be combined (smil) in interesting ways to make fun stuff (using perl, of course!).

Minimal Perl for UNIX People
Tim Maher
Perl is a wonderful language that offers programmers a rich feature set, huge stylistic and syntactic liberties, and many ways to accomplish the same thing. But for the impatient beginner, these characteristics can translate into "too many complications, too much uncertainty, and too many choices." Although Perl's motto is "There's More Than One Way to Do It", this tutorial will teach students only one way -- the "Easiest Way"!

Students will learn a carefully selected minimal subset of Perl that gives immediate access to some of its powerful capabilities, and serves as a solid foundation for additional learning. Upon completion of the tutorial, students will have the necessary skills to convert files, validate data, generate simple reports, and perform numerical calculations in Perl. They'll also know how to handle command-line "switches" in Perl scripts.

This tutorial is particularly well suited to those having some prior experience with UNIX filter commands, such as grep, sed, and awk, but others can also benefit.

Mod_Parrot: Extending and Embedding parrot
Kevin Falcone
This talk will discuss the improvements made in Parrot's embedding and extending interface. The early version of mod_parrot relied on gross hacks to interface with the Apache API. The latest version of Parrot's NCI (Native Call Interface) allows straightforward integration with 3rd party libraries. mod_parrot is a sample implementation of this technology that shows how parrot's ability to host multiple languages leads to easy access to an API from these languages with a single implementation.

The New Kwiki
Ian Langworth
The new Kwiki is a complete refactoring of the older CGI::Kwiki and boasts a more organized and pluggable codebase as well as greater ease of extendibility and customizability. Brian Ingerson, software lead for Socialtext, has slaved for months to provide the second generation of this popular software. Ian Langworth, a student who has participated in the project, will demonstrate Kwiki's spiffy new architecture and plugin system. If you've been waiting for the next version of Kwiki, it's here at YAPC.

Parsing delimited and balanced strings
Abigail
An in-depth beginner's tutorial that discusses the aspects of parsing delimited (for instance, double quoted strings) and balanced (for instance, expressions with parentheses) strings. We will discuss the pros and cons of various approaches, pay attention to efficiency, and consider different kinds of delimited/balanced strings (with/without escapes, delimiters longer than one character, more than one possible delimiter).
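
To make the two kinds of strings concrete, here is a minimal sketch of one simple approach, a hand-written character scanner. It is written in Python, the function names and return conventions are assumptions for this illustration, and the tutorial itself may favour other approaches such as regexes. The scanners handle escapes and delimiters longer than one character; the balanced scanner deliberately ignores quoting inside the brackets.

```python
def scan_delimited(text, start=0, delimiter='"', escape="\\"):
    """Scan a delimited string that begins at `start`.

    Returns the index just past the closing delimiter, or -1 if the
    string does not start with the delimiter or is never closed.
    """
    if not text.startswith(delimiter, start):
        return -1
    i = start + len(delimiter)
    while i < len(text):
        if escape and text.startswith(escape, i):
            i += len(escape) + 1  # skip the escape and the escaped character
        elif text.startswith(delimiter, i):
            return i + len(delimiter)
        else:
            i += 1
    return -1


def scan_balanced(text, start=0, opener="(", closer=")"):
    """Scan a balanced expression that begins at `start`.

    Returns the index just past the matching closer, or -1 if the text
    does not start with the opener or the nesting never returns to zero.
    """
    if not text.startswith(opener, start):
        return -1
    depth = 0
    i = start
    while i < len(text):
        if text.startswith(opener, i):
            depth += 1
            i += len(opener)
        elif text.startswith(closer, i):
            depth -= 1
            i += len(closer)
            if depth == 0:
                return i
        else:
            i += 1
    return -1
```

Both functions make a single left-to-right pass, so their cost is linear in the length of the text scanned.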

Perl 6
Damian Conway
The design of Perl 6 has moved ahead significantly in the past twelve months. In this talk, Conway will look at many new refinements to the language, with special emphasis on Perl 6's powerful new approach to object orientation.

Perl Style Guides for Large Projects
Daniel Allen
Perl style guides have been discussed in many forms; however, most discussions do not take into account the different requirements of projects with multiple developers and a large code-base. Large projects definitely need unified standards and conventions to keep their code from becoming a mess. These requirements are necessary for all languages but are particularly important for Perl, which has language features and a culture which encourage TMTOWTDI (There's More Than One Way To Do It). This talk will discuss a few large perl projects and their style guides, propose additional suggestions, and offer recommendations for adapting these style guides to one's own projects as they grow larger.

Perl Testing for Large Projects
Andy Lester
Automated testing has received more attention in recent years, especially as one of the cornerstones of XP. Perl's automated testing tools have always been geared towards testing of CPAN modules, but the Perl testing framework is excellent for automating large software projects, too. Lester discusses real-world strategies of testing large projects, with many code examples. You'll learn how to test all code in the project, even if it's not Perl, and how to enforce coding and documentation standards; how to write a website testing robot with WWW::Mechanize and HTML::Lint; how to test database integrity with Test::DatabaseRow; and how to write your own domain-specific Test:: module if the dozens on the CPAN don't suit your purposes. It's not all about code. You'll also learn the best practices and culture of automated testing, how to get started automating an existing project, and how writing automated tests can help you write better code in less time.

Perl and Inline Octave
Andy Adler
I use Perl to manage files, and Octave to crunch numbers. Recently, I worked on a project that generated enormous data files, which needed to be processed and then analysed - a perfect task for my two favourite languages. Since I'd just heard a mighty cool talk on Inline, it seemed clear to me that I needed to write Inline::Octave.

Unlike some other Inline languages, such as C or Java, Octave runs as an interpreted environment, and does not natively support sockets or other inter-process communication facilities. The choices were thus: 1) modify octave, or 2) control it from perl by typing into the interpreter, and reading the output. I chose the latter, to allow the technique to work with unmodified Octave. In order to do this, Perl provides the IPC::Open3 module to control the stdin, stdout, and stderr of a process; however, its documentation contains a number of warnings, for example: "This is very dangerous, as you may block forever." Suitably forewarned, I set out on this hazardous enterprise, learning a number of tricks which I will describe.

Perl wizardry without an editor
Andy Lester
Perl's Unix legacy and rich set of "do what I mean" features make it an excellent power tool from the command line. Some of these features are found in Perl's command-line options. My talk explores the tools available to write useful programs from a shell prompt. The emphasis will be on data filtering, in-place editing, and modules aimed at command-line use, such as Text::Autoformat. I'll also provide an overview of some of the CPAN modules that use the -d and -O options, such as Devel::DProf and B::*.

Preparing a CPAN distribution
Mark Fowler
Making a Perl module into a CPAN-compatible distribution not only lets you upload the module to CPAN, but brings numerous other advantages too, from code testing and dependency listing to easy distribution and deployment of your code. This talk covers all that someone building a module distribution needs to know. Discussions include the use of each file in a standard distribution, the tools available to help you create and update these files, what actually happens when you install a Perl module, and what tips and tricks can make your life easier.

Regexp::Common
Abigail
In this presentation we will show how to use the Regexp::Common module. We will show many examples of the patterns available in this module. We'll show how Regexp::Common can be used to solve common parsing problems. Finally, we will show how you can use Regexp::Common to create your own library of often used patterns.

Skip lists: an alternative to trees
Robert Rothenberg
Skip lists are an efficient alternative to binary trees invented by William Pugh in 1989. By maintaining multiple links per node, skip lists can match the performance of binary trees without the overhead of keeping the entire structure balanced. We introduce various types of skip list implementations (nondeterministic and deterministic) and compare their performance and memory requirements to various tree implementations. We present the List::SkipList module, and detail some novel optimizations in the implementation. Finally, we cover additional features which make the module highly flexible.
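
The "multiple links per node" idea can be sketched in a few dozen lines. The following is a minimal nondeterministic (randomized) skip list written in Python; the class and method names, the maximum level of 16, and the promotion probability of 0.5 are assumptions for this illustration and say nothing about how List::SkipList is implemented or optimized.

```python
import random


class _Node:
    __slots__ = ("key", "value", "forward")

    def __init__(self, key, value, level):
        self.key = key
        self.value = value
        self.forward = [None] * level  # one forward link per level


class SkipList:
    def __init__(self, max_level=16, p=0.5, seed=None):
        self.max_level = max_level
        self.p = p
        self.level = 1  # number of levels currently in use
        self.head = _Node(None, None, max_level)
        self._rng = random.Random(seed)

    def _random_level(self):
        # Each extra level is granted with probability p.
        level = 1
        while level < self.max_level and self._rng.random() < self.p:
            level += 1
        return level

    def _predecessors(self, key):
        # For every level, find the last node whose key is less than `key`.
        update = [self.head] * self.max_level
        node = self.head
        for i in range(self.level - 1, -1, -1):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node
        return update

    def insert(self, key, value):
        update = self._predecessors(key)
        candidate = update[0].forward[0]
        if candidate is not None and candidate.key == key:
            candidate.value = value  # overwrite an existing key
            return
        level = self._random_level()
        self.level = max(self.level, level)
        node = _Node(key, value, level)
        for i in range(level):
            node.forward[i] = update[i].forward[i]
            update[i].forward[i] = node

    def get(self, key, default=None):
        candidate = self._predecessors(key)[0].forward[0]
        if candidate is not None and candidate.key == key:
            return candidate.value
        return default

    def keys(self):
        # Level 0 links every node in key order.
        node = self.head.forward[0]
        while node is not None:
            yield node.key
            node = node.forward[0]
```

Searches start on the sparsest level and drop down a level whenever the next key would overshoot, which is what gives expected logarithmic search time without any rebalancing step.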

Sort::Maker
Uri Guttman
At the 3rd perl conference (5 years ago!) i co-wrote a paper (which won as the best technical paper) on efficient perl sorting which covered a new technique called the GRT. In that paper i mentioned that i would be developing a module (Sort::Records) to implement this. i recently have been putting much thought into it and came up with ways to make it much simpler and also more useful. the module will make it easy to create fast sort subs. see the recent threads in the perl6-language list for some of the ideas that will go into this module. The module may also become the core of a perl6::sort module if Damian does the perl6 part.
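
The GRT (Guttman-Rosler Transform) is commonly described as packing each record's sort keys into a single string, sorting those strings with the built-in comparison instead of a custom sort sub, and then recovering the records. The sketch below illustrates that idea in Python; the record layout (name, score), the key encoding, and the function name are assumptions for this example and are not the interface of the planned module.

```python
import struct


def grt_sort(records):
    """Sort (name, score) records by score descending, then name ascending.

    Assumes 0 <= score < 2**32 and that names contain no NUL character.
    """
    packed = [
        # Big-endian packing makes byte order agree with numeric order;
        # subtracting from the maximum inverts it for a descending sort.
        struct.pack(">I", 0xFFFFFFFF - score)
        # The NUL terminator makes a shorter name sort before its extensions.
        + name.encode("utf-8") + b"\x00"
        # The record's position is appended so it can be recovered afterwards.
        + struct.pack(">I", index)
        for index, (name, score) in enumerate(records)
    ]
    packed.sort()  # plain byte-string comparison, no custom comparator
    return [records[struct.unpack(">I", key[-4:])[0]] for key in packed]
```

The design trade-off is that all of the per-record work happens once, while building the packed keys, rather than on every comparison during the sort.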

Sufficiently Advanced Technologies
Damian Conway
In module design, interface is everything. Going one step beyond this dictum, Conway demonstrates and explains several practical applications of Clarke's Law ("Any sufficiently advanced technology is indistinguishable from magic") by presenting a series of useful modules whose interface is...nothing.

Test Driven Development (TDD) using Perl (Agile development)
Ruslan (Ted) Kharitonov
Test Driven Development (TDD) improves the quality of the development cycle, reduces the number of bugs, and provides you with a detailed design specification for your application. In this presentation you will learn about TDD and how to use Test::More to write automated tests in Perl.

The Open Source programmer's guide to getting a great job
Andy Lester and Bill Odom
You don't have to live on the west coast to get a great job and still give back to the Open Source community. Unlearn the conventional wisdom that cripples most job hunters. Forget what you think you know about getting a great job. This interactive session is in three sections, with plenty of time for Q&A.

Finding your job. The right job for you is out there. You just have to find it, or have it find you. Learn the right search techniques: Only 1% of jobs are filled through job boards! Learn to grow and use your circle of contacts, including in the OS world, to help in your search. Take home techniques to find out about the company you're interested in: Google is only the start.

Get an interview. A pretty resume is worthless if the hiring manager doesn't see what he needs. Use your Open Source background to get in the door. Find out what the manager wants. Learn how to read a job posting. Create a resume and online presence that lets potential employers know about you. Does your online footprint help or hinder you?

Get the job. Learn the three crucial mistakes not to make in your interview. Find out how to turn the old-style Q&A interview into a working session to prove that you are the right candidate.

Use WxPerl to Quickly Create "Professional-Looking" Graphical User Interfaces
Sam Skielnik
This talk will introduce the use of WxPerl to quickly create "professional-looking" Graphical User Interfaces (GUI). The talk will attempt to answer the following questions:

What is WxPerl and Why Not Use Perl/Tk?
What major GUI widgets are available? (menus, notebook tabs, spreadsheet grid, etc.)
What are the basic steps to creating a Perl script with a WxPerl GUI?
What free GUI Designer/Layout tool can I use to generate WxPerl scripts?

Using the Perl Debugger
Daniel Allen
This introduction to the Perl debugger is targeted at beginning and intermediate perl programmers.

Preamble: avoiding bugs with 'warnings' and 'strict'
Motivation: What's wrong with print "got here" if ($debug); ?
Invoking the debugger.
First 6 essential commands
Next 5 useful commands: Breakpoints, Actions, Watchpoints
Useful Customizations.

Why mod_perl 2.0 Sucks, Why mod_perl 2.0 Rocks
Geoffrey Young
Have you tried working with mod_perl 2.0 yet? Ugh. With all those new classes and directives to learn, not to mention the list of incomplete features, you might as well stay with the trusty, stable mod_perl of old. And subroutine attributes? Eesh. Of course, the new 2.0 API does let you do fun stuff like write output filters. Oh, and there's the Apache-Test framework that's pretty cool. Not to mention a method called assbackwards(). This brief, fun talk will introduce mod_perl 2.0 by poking fun at its shortcomings as well as showcasing its promise.

Writing Tests with Apache-Test
Geoffrey Young
Tests make your life easier, and Apache-Test makes writing live webserver tests easy. The Apache-Test framework is arguably one of the best things to emerge from the mod_perl 2.0 redesign effort. All you need to do is write the tests and *poof* Apache-Test takes care of configuring and starting the server, running your tests, stopping the server, and reporting back your successes (or failures). This talk will introduce the Apache-Test interface and detail how to let it make your life easier. We will step through the process of writing a complete test suite for a simple Apache:: module, from generating the Makefile.PL to deciding which aspects of our module ought to be tested -- everything you need to be able to start writing tests for your neglected web applications.