2
A Perl5 Overview and Tutorial


New Features in Perl5
Extended Perl5 Tutorial
Traps for Perl4 Programmers Migrating to Perl5

The latest major version of Perl is Version 5, and because most or all the examples in this book require it, I dedicate this chapter to trying to teach you a bit about how it works and what's different about it, as compared to its predecessor, Perl4.

Perl5 is now into its third (and probably fourth, by the time this book is published) public release. The early stages of development were fast-paced and difficult to keep up with, but the specifications have settled, and most of the core functionality has stabilized, both in its implementation details and in its usability.

NOTE:

When I refer to the major version, I mean the first number in the version specifier--for example, the 5 in 5.002. The public release number, in this context, is 002. As of this writing, Perl is currently at revision number 5.003_05. The _05 delineates a development patch suite that postdates the official 5.003 release. The next public release, 5.004, will evolve from at least one additional development patch.

In the following sections, I'll introduce you to the new features that are available to the Perl programmer using Version 5. First, we'll take an extended look at some of the more important Perl5 features, as they relate to this book. Next, I'll give you a detailed explanation of reference variables, describing how they work, and what they're used for. Then we'll take an in-depth look at modules and extensions for Perl5. I'll describe how to use them, where to get them, and why you should use them in the first place. Of course, you will need to use various modules extensively throughout this book, in most or all of the examples. There are, however, other, more fundamental reasons for using modules, and I'll emphasize these throughout the book.

NOTE:

I provide for you, on the accompanying CD-ROM, the latest version of the modules used in this book, as of the date of the assembly of the materials on the CD-ROM. You'll need to update them as new versions are released. You can find more details on updating Perl modules and extensions later in this chapter. Also see the description of the CPAN, in Chapter 1.

Towards the end of this chapter, I'll show how you can use object-oriented techniques to implement your own customized functionality in your Perl programs that use modules or extensions, with a minimal amount of additional code. Finally, I close the chapter with a description of the "traps" to be wary of when you're using Perl5--for example, things that worked one way with Perl4 and now work differently, or not at all, with Perl5. Not too many of these traps exist, and they're documented, so don't worry too much about them, especially if you're starting from scratch and don't need to convert any Perl4 programs to Perl5.

I make some assumptions in this chapter and, as previously mentioned, in this book. Namely, I must assume that you have used Perl and written programs in the Perl language. You should be familiar with most or all the Perl4 data types, operators, and syntax. Furthermore, I'm not going to try to introduce all of Perl5 here. That is in the realm of a complete book or books, some of which have recently been released. My primary goal in this chapter is that you gain the familiarity you need to understand the examples given within this book, and that you know enough when you're finished reading it to implement your own programs and Web functionality using the techniques described.

New Features in Perl5

You can use a number of new features and enhancements with Perl5. Many of them are utilized in the modules demonstrated in this book. In the following sections, I provide a short overview of most of them.

Usability and Simplicity

Some major improvements have been made to Perl, in terms of the layman's ability to use it and understand it. While it's always been a tool for the common man, its latest release has seen major work towards making it even easier to use and understand. Let's see how this was accomplished.

Enhanced Documentation

Probably the most significant improvement to the Perl distribution, outside of Perl itself, is the documentation. The single monolithic manual page has been split up into logical sections corresponding to the various aspects of Perl programming, along with sections related to the more advanced aspects of Perl, like embedding the Perl interpreter in an external program, and other sections as well.

A Simple Convention

As you read through this and other chapters, you'll notice the capitalized references to the various sections of the new manual, of the form PERLBLAH, where the BLAH corresponds to the section of the new Perl manual that is being referred to. Such references are there to help you find your way to the related sections of the Perl manual, regarding the current subject matter.

The Perl manual now has a total of 32 standard sections, each with a specific intent. Table 2.1 lists them all.

Table 2.1. Standard Perl manual sections.

PERL Perl overview

PERLTOC Perl documentation table of contents

PERLDATA Perl data structures

PERLSYN Perl syntax

PERLOP Perl operators and precedence

PERLRE Perl regular expressions

PERLRUN Perl execution and options

PERLFUNC Perl built-in functions

PERLVAR Perl predefined variables

PERLSUB Perl subroutines

PERLMOD Perl modules

PERLFORM Formats, and using write()

PERLREF Perl references

PERLDSC Perl data structures intro

PERLLOL Perl data structures: lists of lists

PERLOBJ Perl objects

PERLTIE Perl objects hidden behind simple variables

PERLBOT Perl OO tricks and examples

PERLIPC Perl interprocess communication

PERLDEBUG Perl debugging

PERLDIAG Perl diagnostic messages

PERLSEC Perl security

PERLTRAP Perl traps for the unwary

PERLSTYLE Perl style guide

PERLXS Perl XS application programming interface

PERLXSTUT Perl XS tutorial

PERLGUTS Perl internal functions for creating extensions

PERLCALL Perl calling conventions from C

PERLEMBED Perl: how to embed Perl in your C or C++ application

PERLPOD Perl plain old documentation

PERLAPIO Perl internal IO abstraction interface

PERLBOOK Perl book information

The documentation also contains numerous additional sections corresponding to the standard modules that ship with Perl. Most or all of these additional sections are extracted from the embedded POD, which is to be found in the module file itself. All Perl documentation is written first in POD and then translated to other formats.

POD

Plain Old Documentation. ASCII text documentation with markers corresponding to the various formatting elements. Can be embedded directly into Perl modules. See PERLPOD.

You can easily transform POD into standard UNIX *ROFF format, HTML, and a number of other formats by using the pod2* (pod2man, pod2html, pod2text, and so on) converters. There exist POD converters to many other types of formats, as well. The POD format also implies that you can read the documentation directly, without any post-formatting at all. Everything that I cover in this chapter is also documented in the Perl PODs, and I give references to the specific sections as I go along, using the PERLBLAH notation as previously mentioned. POD can be embedded directly into a Perl module, or program, and nearly all of the modules have them already.
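To make the idea of embedded POD concrete, here is a minimal sketch of a script that carries its own documentation. The subroutine name greet and the file name hello.pl used below are hypothetical, chosen only for illustration.

```perl
#!/usr/bin/perl -w
use strict;

=head1 NAME

greet - return a greeting (hypothetical example)

=head1 SYNOPSIS

    print greet("world");

=cut

# The interpreter skips everything from the first =head1 to =cut,
# so the documentation never interferes with the code.
sub greet {
    my $who = shift;
    return "Hello, $who\n";
}

print greet("world");
```

If you save this sketch as hello.pl, running pod2text hello.pl should print the NAME and SYNOPSIS sections, while running the script itself prints the greeting.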

When Perl is installed on a typical UNIX site, the POD documentation, including POD from the modules, is converted automatically into standard UNIX manpages. The administrator usually installs it inside the primary @INC directory, usually in a subdirectory called man. Macintosh Perl installations have the POD, converted to HTML format, in a folder beneath the folder that contains the Perl application, named pod. The Windows (ntperl) installation also has the documentation, converted to HTML, in the directory called docs. You should find this directory, and be ready to refer to the documentation within it.

As I mentioned previously, conversion tools are available for POD, including pod2html, pod2text, pod2rtf, pod2tex, pod2inf, and now even pod2ps, and the pod2man program. You can use any of them to convert from POD format to your preferred format, provided that the tool has been written to work on your architecture. All these tools work on UNIX, and some are configurable to work on other platforms.

A Note on Compatibility

There are, as you might guess, still a few incompatibilities when attempting to produce cross-platform Perl code, and not all scripts run on all architectures by default. I'll try to note these, as I progress through this chapter and through the book. Filepaths, for instance, shouldn't be hardcoded in Perl programs, but often are. This issue in general is being considered and worked on by the Perl Porters, a large group of brilliant people who help to bring you Perl.

Remember to check your favorite CPAN to get the latest versions of the pod2* programs.

CPAN

Comprehensive Perl Archive Network. A large group of well-connected Internet sites that maintains a copy of the master Perl archive. You can find more details later in this chapter and a complete history and description of the CPAN in Chapter 1.

Readability Improvements

The ability to provide easily readable and reusable code has become more important as the level of formal training and skills required to start a Web site has decreased. The responsibility is largely up to the script author to implement this readability. Perl5 also provides some new functionality that enhances the capability of the script author to do so.

The English module increases the readability and understanding of Perl code, and it is a big step toward alleviating the boggling effect that raw Perl code sometimes has on new programmers. The English module maps each of Perl's eclectic punctuation (special) variables to a corresponding English name. The regular-expression variables that correspond to the three components of a matched string, for example, are often difficult to remember, even for the experienced Perl programmer. The English module maps these variables as follows:

*MATCH = *& ;

*PREMATCH = *` ;

*POSTMATCH = *' ;
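The aliases can be tried out with a short sketch like the following. The sample string and the variable names $pre, $match, and $post are my own, for illustration only.

```perl
#!/usr/bin/perl -w
use strict;
use English;

my $text = "hello big world";
my ($pre, $match, $post);

if ($text =~ /big/) {
    # The English names are aliases for $`, $&, and $' respectively.
    ($pre, $match, $post) = ($PREMATCH, $MATCH, $POSTMATCH);
    print "[$pre][$match][$post]\n";
}

# prints: [hello ][big][ world]
```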

Thus, when you use the English module in your program, you can use $MATCH, $PREMATCH, and $POSTMATCH instead of using $&, $`, or $' and chasing through the manual to verify whether you need an ampersand, backtick, or single quotation mark following the $, each time you want to access these built-in variables. See the complete English.pm module in @INC, and its embedded POD documentation, or English.3, the POD converted to a manpage, for more details.

New Logical Operators

The new logical operators and, or, and not enable you to avoid using the &&, ||, and unary ! operators, respectively. The new operators are definitely more readable, to the casual observer at least. They also have lower precedence than the comma and the other low-precedence operators, including the assignment operators. Consider the following:

$foo/=0 || print "aak";



# prints: aak

Using ||, it looks like you can divide by 0. (Actually you're dividing by the ||'d value of zero and one, the return value from print.) Now consider the same example, using the or operator:

$foo/=0 or print "aak";



# prints: Illegal division by zero

Using or (and Perl5) gives the expected result. See PERLOP for more details on operator precedence.

Warnings and Stricture

Other usability enhancements in Perl5 include improvements to the -w command-line switch, which now gives more useful and informative output. You should use it with all your programs. See PERLRUN for a complete description of all command-line switches.

Also new is a set of pragmatic modules, which impose certain restrictions and perform extended type and syntax checking on your code at compile time, to potentially help you find bugs before they bite. Use of these pragmas within your code is also highly recommended. See PERLDSC and PERLMOD for more details on the motivation for and impact of using the strict modules.

The New => Operator

The => operator is syntactic sugar for a comma. It makes certain declarations look prettier and appear more sensible, as in the following example:

%hash = (

    'Name' => 'Joe',

    'Address' => '123 Foo Street',

    'City' => 'San Francisco',

    'State' => 'CA'

);

Note how we used the comma at the end of each hash key/value pair, but the => operator between each key and its value. This makes such declarations easier to read, especially when declaring more complex data structures.

Function Prototypes

Function prototypes are one of the very latest new features in Perl5. They were finally included as of Perl5.002, after a great deal of consideration and discussion. Essentially, they provide you with a means to assure that the correct arguments are passed to your subroutines, to emulate the behavior of built-in commands, for instance. See PERLSUB for more details.
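As a small sketch of what a prototype can do, the following hypothetical subroutine, count_items, declares a (\@) prototype so that it receives its array argument by reference, much as the built-in push() does. The subroutine and variable names are assumptions made for the example.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical helper: the (\@) prototype tells Perl to pass the
# array argument as a reference instead of flattening it.
sub count_items (\@) {
    my $aref = shift;
    return scalar @$aref;
}

my @list = (10, 20, 30);
my $n = count_items(@list);   # no backslash needed at the call site
print "count: $n\n";

# prints: count: 3
```

Note that a prototype takes effect only for calls that are compiled after the declaration and that don't use the & form of the call.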

Lexical Scoping

The new my() operator enables you to declare variables that are truly lexically scoped. Previously, the closest you could get to a lexical variable was with local(). The my() variables are visible only within the current block (within a set of curly braces) and go out of scope immediately after exiting the block. The following example illustrates their use:

$foo = "I\'m foo in a global context\n";

print $foo;



{  # the curlies open a new lexical context (block)

    my $foo = "But I\'m bar in this short lexical context\n";

    print $foo;

} # lexically scoped $foo goes out of scope here



print $foo;



# prints:

I'm foo in a global context

But I'm bar in this short lexical context

I'm foo in a global context

The my() variable is used extensively in the modules that you'll explore in this book. See PERLVAR for complete details on all types of Perl variables, including lexical variables.
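The difference between my() and local() is easiest to see when another subroutine reads the variable. The sketch below uses hypothetical names (show, with_local, with_my, and $level) to contrast the two.

```perl
#!/usr/bin/perl -w
use strict;
use vars qw($level);          # declare a package (global) variable

$level = "global";

sub show { return $level; }   # always reads the package variable

sub with_local {
    local $level = "local";   # temporarily changes the global itself
    return show();
}

sub with_my {
    my $level = "my";         # a brand-new variable, invisible to show()
    return show();
}

my $from_local = with_local();
my $from_my    = with_my();
print "local: $from_local, my: $from_my, after: $level\n";

# prints: local: local, my: global, after: global
```

The local() value is seen by any subroutine called while it is in effect, whereas the my() variable exists only within the block that declares it.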

References

Standard Perl scalars can now be used to refer to other data types. References act much like pointers in C. In the general sense, they simply refer to other data types. When a reference is blessed into a package, it becomes a Perl Object, and can be used to invoke the methods of its package (or class), as well as to access the instance variables of the class, something akin to C++ references.

Because we'll explore the references in depth later in this chapter, I'll defer most of the examples, and a complete description of them until that time. I'll also explain how the bless() operator works at that time. The reference datatype is also used extensively in the examples in this book.

Data Structures

Nested data types and other data structures are now possible, using references. You'll also learn about this topic later when we look at references in depth. A simple example that implements an array of arrays follows:

@foo = ( 1, 2, 3);

@bar = ( 4, 5, 6);

@arrayref = (\@foo, \@bar);

print "The third element of \@foo is $arrayref[0][2]\n";



# prints: The third element of @foo is 3

Here, I specify the zero'th element of @arrayref, which is an array of references to arrays, and then use it to dereference the third element of @foo. @arrayref actually looks and feels just like a two-dimensional array, but it's important to understand that it isn't.

Perl does not support multidimensional arrays. All arrays are still flat. By using references, however, you can emulate multidimensional arrays and build up complex data structures. See PERLDSC for a compendium on Perl data structures.
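The same technique works for other combinations. As a minimal sketch, here is a hash whose values are references to anonymous arrays; the hash name %langs and its contents are made up for the example.

```perl
#!/usr/bin/perl -w
use strict;

# A hash whose values are references to anonymous arrays.
my %langs = (
    'compiled'    => [ 'C', 'C++' ],
    'interpreted' => [ 'Perl', 'Tcl' ],
);

# Add an element through the reference, then read one back.
push @{ $langs{'interpreted'} }, 'Python';
my $second = $langs{'interpreted'}[1];
my $count  = scalar @{ $langs{'interpreted'} };

print "second: $second, count: $count\n";

# prints: second: Tcl, count: 3
```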

Modules and Libraries

Perl5 provides you with a number of modules and extensions that contain packages (classes). Using them simplifies and enhances the process of creating new Perl programs. These modules are akin to C++ class libraries, in many ways, but are still simply Perl packages, in the end. The older Perl4 libraries and packages are, in most cases, still available, but many have been rewritten as modules where appropriate.

Working with Perl modules is the basis of this book. I supply very little code that is new. The idea is that, once something has been developed to perform a task, you should make use of it, or you're wasting your precious time.

TIP:

"Don't reinvent the wheel." Although this advice sounds trite, it's important to the continued development of Perl5 as a viable object-oriented language. The authors of the many useful Perl5 modules have spent a lot of time and effort to provide their modules. Because this voluntary contribution is the foundation of the Perl development effort, it should be nurtured and utilized to the fullest extent. I suggest that you try to utilize these modules and help out where you can, by reporting bugs and providing fixes and enhancements back to the author where appropriate.

Reusability

Using the new object-oriented features and techniques of Perl5 to develop new programs enhances the potential for reusability of your code. Well-written programs can be adapted to serve multiple purposes, possibly even being promoted to the status of libraries or modules. (You'll learn how to register and/or submit your code as a new module later in this chapter.)

Object-Oriented Capabilities

You can use Perl modules together with references to achieve an object-oriented look and feel in your programs. You can create inheritance relationships (single and multiple), virtual classes, constructors, and destructors, and implement simple messaging with Perl5.

I cover some of these techniques within the extended portion of this tutorial. If you're not familiar with object-oriented programming, refer to the many books available, which describe the general techniques and the extended methodologies.
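To give a first taste before the extended tutorial, here is a minimal sketch of a constructor, a method, and single inheritance. The class names Animal and Dog, and the choice of a hash to hold the instance data, are assumptions made purely for illustration.

```perl
#!/usr/bin/perl -w
use strict;

package Animal;                       # hypothetical base class

sub new {                             # constructor
    my ($class, $name) = @_;
    my $self = { 'name' => $name };   # instance data lives in a hash
    return bless $self, $class;       # the reference becomes an object
}

sub name  { my $self = shift; return $self->{'name'}; }
sub speak { my $self = shift; return $self->name . " makes a sound"; }

package Dog;                          # hypothetical subclass
@Dog::ISA = ('Animal');               # single inheritance via @ISA

sub speak { my $self = shift; return $self->name . " barks"; }

package main;

my $pet  = Dog->new('Rex');           # new() is inherited from Animal
my $line = $pet->speak;               # Dog's override is called
print "$line\n";

# prints: Rex barks
```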

Extensible and Embeddable

The Perl modules provide a reusable interface for many commonly used programming tasks. Many of the modules are also Perl extensions, which means that some component of their interface is actually written using the Perl XSUB language. After the XSUB code is translated to C, it is compiled with a C compiler. The functions in the C code are then accessible as Perl subroutines in your program.

Perl5 is now embeddable, as well. A programmer can create a Perl interpreter in any regular program written in C and, through the use of CallBacks and other internal routines, interface to the Perl interpreter within his or her program. This capability can provide an extremely powerful set of additional features for editors, servers, and other tools.

NOTE:

The details regarding XSUB programming and creation of shared libraries are beyond the scope of this tutorial. In this book, I don't tell you more than what you need to use them in your Perl programs. I assume that they've been built and installed on your machine. Likewise for embedded Perl, I only mention it here as a new feature.

Souped-Up Regular Expressions

The Perl regular expression--and its associated functions--remains one of the most powerful and useful features of the overall language.

A Convention

The words "Regular Expression" from here on may be shortened to "RE."

All of the older Perl4 RE functionality remains in Perl5, and several capabilities have been added.

Possibly the most interesting and usable of the newer features is the extension syntax that is now available for regular expressions. The extensions that are currently available with Perl5 enable you to embed comments in your pattern, do grouping without backreferences, perform zero-width positive lookahead assertions, and even have embedded pattern-match modifiers. See PERLRE for details on all the powerful new RE features.
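The following sketch exercises each of the extensions just listed on one sample string. The string and the variable names are hypothetical.

```perl
#!/usr/bin/perl -w
use strict;

my $line = "Price: 42 dollars";

# The x modifier lets whitespace and comments appear in the pattern.
my ($amount) = $line =~ /
    (?i)            # embedded modifier: match case-insensitively
    price:\s*       # literal label, matched in any case
    (\d+)           # capture the number
    (?=\s+dollars)  # lookahead: the word dollars must follow
/x;

# (?:...) groups without creating a backreference, so the first
# (and only) captured value here is the word after the number.
my ($unit) = $line =~ /(?:\d+)\s+(\w+)/;

print "amount: $amount, unit: $unit\n";

# prints: amount: 42, unit: dollars
```

One caution when commenting a pattern this way: the comment text must not contain the pattern delimiter, or Perl will end the pattern early.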

Enhanced Safety and Security

Perl5 integrates the TAINT features into the single Perl binary. All the runtime checks and assurances remain when executing an SUID script, and you can even turn on the TAINT features from the command line by using the -T switch. No additional binaries need to be executed when running an SUID script. Naturally, this applies only to architectures that support multiple user IDs, such as UNIX, and safe SUID scripts. See PERLSEC for more details.

Warning! SUIDPERL patch is necessary for older versions.

We'll mention this again in the security chapter, but it's important enough to say right now. If you're using SUIDPERL, you need to be sure to apply the patch that was released with Version 5.003. There's also a patch for older Perl4 SUIDPERL versions. You can get the patch at your nearest CPAN.

A large number of new Perl modules provide the programmer with a rich new toolset for dealing with security issues. Some modules provide encryption and WWW administrative tasks, whereas other modules enhance standard system administration programming tasks. Probably the most important security module is the Safe module, which allows you to selectively enable or disable certain Perl operations within a program. You learn more about these modules in Chapter 3, "Security on the Web."

Other New Features in Perl5

Now let's take a look at some other new features that you get with Perl5. Newer ones are being considered on an ongoing basis, and may not be mentioned, but the ones that follow are now formally part of the language.

BEGIN and END Routines

The BEGIN and END statements provide the scriptor with a means to implement certain functionality in the Perl program as it is being compiled or just after it exits. Anything you place within the BEGIN{} block is guaranteed to be executed before any other statement in your program. It executes at the time the program is being compiled. The following gives a simple example:

print "done that\n";

BEGIN { print "been there, "; }

# prints: been there, done that

Likewise, anything you place in an END{} block is executed after every other statement in the program has executed and just before Perl exits. Multiple END{} blocks are executed in the reverse order of their definition, as follows:

END { print "I\'ll see you in my dreams\n"; }

END { print "Irene.  "; }

print "Goodnight, ";



# prints: Goodnight, Irene.  I'll see you in my dreams

use() and no() Statements

The use() statement imports symbols and/or semantics from the named package into your program by aliasing subroutines and/or variable names from the package into the current package's namespace. When you say

use Module;

in your program, it's the same as saying

BEGIN{ require Module; import Module List};

where Module has exported some List of methods (subroutines) via its @EXPORT or @EXPORT_OK arrays, and you're making them part of your current package namespace (main, in this case). Note that the import method is not a builtin, but is actually a method itself, usually inherited from the Exporter module. Much more on this later in this chapter.

You can also specify that nothing be imported by giving an empty list to use(), like this:

use SomeModule ();

Then you have access to all the methods of the package but only via the full name of the method or a blessed reference to the package. I discuss these details later in the chapter.
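As a brief sketch, the standard File::Basename module can be loaded with an empty import list and its basename() subroutine called by its full name. The path used here is just a sample string.

```perl
#!/usr/bin/perl -w
use strict;
use File::Basename ();    # load the module, import nothing

# basename() was not imported, so call it by its full package name.
my $file = File::Basename::basename("/usr/local/lib/perl5/English.pm");
print "$file\n";

# prints: English.pm
```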

When you use one or more of the pragmatic modules, the use() statement imports semantics instead of symbols. For instance,

use strict 'subs';

This statement causes a compile-time error if you try to use a bareword identifier that isn't a subroutine, unless it appears in curly braces or on the left side of the => operator. These pragmas are then in effect through the end of the enclosing block or file, or until the corresponding no() statement turns them off, as in the following:

no strict 'subs';

See PERLMOD and PERLDSC for more details on the pragmatic modules, which were previously mentioned when describing the new warnings and stricture, and PERLFUNC for the official definition of use() and no(). There's also the POD within strict.pm for full documentation of the strict pragmas. The strict.pm module is installed with the rest of the Perl modules in @INC.

The New :: Operator and the -> Operator

The :: and -> operators are provided as a means for invoking the methods within a given package. You should use the Package::subroutine() syntax instead of the older, deprecated Package'subroutine() syntax; the new syntax provides the same semantics. The :: operator also works when you're accessing variables within a package, such as $Package::scalar, @Package::array, and %Package::hash, just like the older, single tick operator does. Again though, the single tick operator may not be part of the language forever. It's better to use the :: operator in all new code.

The -> operator provides a means to invoke the methods of a class or package, along with any methods of any of its parent classes, using a blessed reference to the package. It also serves as a postfix dereference operator for any general reference. You will learn about this operator in detail later in this chapter.

\U, \L, uc(), lc(), ucfirst(), and lcfirst() Operators

The \U, \L, uc(), lc(), ucfirst(), and lcfirst() operators enable you to operate on strings, modifying them to be all uppercase or all lowercase, or changing the case of just the first letter of the string. The \U and \L operators work within the double-quoted string, and the others take strings as arguments and return modified copies. For example,

$ucstring = "FOO";

$lcstring = lc($ucstring);

print $ucstring, ' ', $lcstring, ' ', ucfirst($lcstring), "\n";



#prints:

FOO foo Foo
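The \U and \L escapes can be sketched in the same way. In this example, the \E escape ends the case conversion and \u upper-cases a single character; the variable names are hypothetical.

```perl
#!/usr/bin/perl -w
use strict;

my $name = "perl";

my $shout   = "\U$name\E rocks";     # \U upper-cases up to \E
my $whisper = "\LLOUD\E and Clear";  # \L lower-cases up to \E
my $title   = "\u$name";             # \u upper-cases one character

print "$shout | $whisper | $title\n";

# prints: PERL rocks | loud and Clear | Perl
```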

Closures

A closure is implemented as an anonymous subroutine or a reference to a subroutine. It is generally declared using a reference, which is then used to invoke the closure. Consider the following example:

sub newanon{

my $foo = shift;

my $anonsub = sub {

    my $arg = shift;

    print "Hey, I'm in an $foo $arg\n";

};

return $anonsub;

}



$closure = &newanon('anonymous');





{ # new lexical scope



&$closure('subroutine');



}

# prints: Hey, I'm in an anonymous subroutine
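As a further sketch of the same idea, each closure returned by the hypothetical make_counter subroutine below keeps its own private copy of $count, so the two counters advance independently.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical generator: each call returns a new counter closure.
sub make_counter {
    my $count = shift;               # private to this one closure
    return sub { return $count++; };
}

my $from_one = make_counter(1);
my $from_ten = make_counter(10);

# Each closure remembers where its own $count left off.
my @seen = (&$from_one(), &$from_one(), &$from_ten(), &$from_one());
print "@seen\n";

# prints: 1 2 10 3
```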

A lexical variable that a closure refers to remains intact in future invocations of the subroutine, even if it is invoked outside the lexical scope of the declaration. See PERLSUB and PERLREF for detailed examples and descriptions of closures.

Multiple Simultaneous DBM Implementations

The Perl programmer can now access a number of DBM implementations simultaneously within a program. Perl is now shipped with SDBM, and consideration is being given to including the Berkeley DB implementation by default with Perl, but this has not been implemented yet. GDBM, NDBM, and ODBM are also available, if you have them on your machine. Having simultaneous DBM implementations makes it easy to convert from one DBM format to another within the same program.

You should note that the older dbmopen() function has been deprecated in favor of the tie() function. See PERLFUNC for more details on implementing a tie()'d DBM hash.

Flags on #! Line

Any regular Perl command-line options (flags) appended after the

#!/usr/bin/perl

line in a program are now correctly interpreted, even if the script isn't executed directly. The startup line:

#!/usr/bin/perl -d

will, for instance, invoke the debugger each time the script is run.

Summary of the New Perl5 Features

Perl5 is easier to learn and use and is clearly more powerful than previous major versions of Perl. Other new features, modules, and documentation that enhance Perl's usability are also available but haven't been mentioned here. You should explore them all as time allows.

Extended Perl5 Tutorial

Now that you've been introduced to the new features in Perl5, you're ready to embark on an extended tutorial on references and modules. You need to understand how these elements work so that you can make use of the examples to follow in this book. There's a lot to cover here, so grab a cup of coffee, and I'll try to avoid monotony. You might find it helpful to be sitting at your computer with your copy of Perl5 ready to run so that you can try out the sample code as you go along.

References

In the past, the Perl programmer had to go through some contortions to implement various complex data types, such as arrays of arrays. The notion of a variable that "pointed" to another data type did not exist. With the advent of Perl5, you now have the reference variable type. References are actually just standard Perl scalar variables, which are assigned or initialized to allow them to be used to refer or "point" to some other Perl data type. References give you powerful new capabilities when writing Perl programs.

To create a new real reference variable, you use the following general syntax:

$variable = \datatype;

Here, you set $variable to be a reference to datatype by preceding datatype with a backslash. $variable can now be used to refer or assign to datatype using an explicit form of a dereference, depending on what datatype is. Table 2.2 illustrates the syntax for using real references. A number of other types of references also exist, each with its own assignment syntax, but I won't explain those types just yet.

Table 2.2. References: Data types and assignment/dereference syntax.

Data Type           Assignment Syntax      Dereference Syntax

Scalar              $ref = \$var;          $$ref

Scalar Array        $ref = \@array;        @{$ref}, or ${$ref}[0] for individual elements

Hash                $ref = \%hash;         %{$ref}, or ${$ref}{key} for individual elements

Reference           $refref = \$ref;       $$$refref

Subroutine (CODE)   $ref = \&sub;          &$ref

Package             bless $ref, Package;   $ref->method(), $ref->variable
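The first five rows of Table 2.2 can be exercised in one short sketch. All the variable and subroutine names here are hypothetical.

```perl
#!/usr/bin/perl -w
use strict;

my $var   = "scalar";
my @array = (1, 2, 3);
my %hash  = ('key' => 'value');
sub hello { return "hello"; }

my $sref   = \$var;      # reference to a scalar
my $aref   = \@array;    # reference to an array
my $href   = \%hash;     # reference to a hash
my $refref = \$sref;     # reference to a reference
my $cref   = \&hello;    # reference to a subroutine

my @out = (
    $$sref,              # the scalar itself
    scalar(@{$aref}),    # the whole array, here counted
    ${$aref}[0],         # one array element
    ${$href}{'key'},     # one hash element
    $$$refref,           # two levels of dereference
    &$cref(),            # call through the code reference
);
print join(", ", @out), "\n";

# prints: scalar, 3, 1, value, scalar, hello
```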

In the following sections, you'll look at each of these data types in depth, and I'll demonstrate their use with some examples. These examples should provide you with some general insight as to how each type of reference can be used, but note that they are not comprehensive. You can study the full power and capabilities of references by reading PERLREF and PERLDSC. Another potentially useful document for studying how references work is the test script for references in the Perl distribution, called ref.t. Under UNIX, you can find it in the t/op directory under the Perl build directory. Under Macintosh, the t/op directory should be located within the installation directory. Under Windows 95, this directory is named ntt, and the file has been given the .ntt extension. Look for ntt/op/ref.ntt instead of t/op/ref.t under Windows (ntperl). It contains a complete test suite for all types of Perl references.

References to Scalars

Scalar variables, the simplest type of Perl variable, can be referenced, as can all other types. Although the usefulness of references to simple scalars may not be immediately evident, referencing is certainly an option.

Consider the following example:

$foo = "Initial value";

&update_scalar();

print $foo,"\n";



sub update_scalar{

    $foo = "Updated";

}



# prints: Updated

This example takes a global variable, $foo, and sets it to an initial value; then it calls the update_scalar subroutine to set it to a new value. Simple enough, but if the update_scalar subroutine lives in a package, you're out of luck. Observe the following:

$foo = "Initial value";

&test::update_scalar();

print $foo,"\n";



package test;

sub update_scalar{

$foo = "Updated";

}



# prints: Initial value

Here, the $foo variable doesn't get changed because the test package has its own namespace and its own $foo variable, and can't access the $foo in main without some specific semantics. When you work with modules and packages, you'll be faced with this restriction.

So, what to do? Well, you could pass in the $foo from main as a parameter to the subroutine and try to update it within the subroutine like this:

$foo = "Initial value";

&test::update_scalar($foo);

print $foo,"\n";



package test;



sub update_scalar{

($foo) = @_;

$foo = "Updated";

}



# prints: Initial value

Alas, the $foo that gets updated in the update_scalar subroutine is just a copy of the $foo that is passed in. You're still dealing with two distinct variables, in different packages, and you're essentially passing by value when you make a reassignment within the subroutine. The experienced Perl4 programmer will recognize that there's also the option of modifying $_[0] directly, but references provide a cleaner solution.
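For comparison, the $_[0] option can be sketched as follows. It works because the elements of @_ are aliases for the caller's arguments. Note that this sketch declares $foo with my(), unlike the surrounding examples.

```perl
#!/usr/bin/perl -w
use strict;

package test;

# Elements of @_ are aliases for the caller's arguments, so assigning
# to $_[0] changes the variable that was passed in.
sub update_scalar {
    $_[0] = "Updated";
}

package main;

my $foo = "Initial value";
test::update_scalar($foo);
print $foo, "\n";

# prints: Updated
```

The drawback is that nothing at the call site hints that $foo may be modified, which is part of why the reference-based approach reads more clearly.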

The solution I've chosen, using a reference to a scalar, is to create a reference to main's $foo, pass it into the update_scalar subroutine, and then dereference for the assignment, as follows:

$foo = "Initial value";

&test::update_scalar(\$foo);

print $foo,"\n";



package test;



sub update_scalar{

($foo) = @_;

$$foo = "Updated";

}



# prints: Updated

Notice how you pass the reference to the subroutine by applying the backslash operator to main's $foo variable in the subroutine call. You thus pass main's $foo by reference to the update_scalar subroutine, and when you assign it to the $foo in the subroutine, you are actually creating a real reference to the $foo in main. Using the dereferencing syntax described in Table 2.2, you then can change $main::foo indirectly through the reference, using the $$foo dereferencing syntax.
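The same mechanics can be seen without a subroutine at all. The following sketch, which assumes a lexical $foo and a reference variable named $ref (both chosen only for illustration), uses the built-in ref() function to show what kind of thing the reference points to and then assigns through it:

```perl
use strict;

my $foo = "Initial value";
my $ref = \$foo;          # reference to the scalar

my $kind = ref($ref);     # the string "SCALAR"
$$ref = "Updated";        # assign through the reference

print "$kind: $foo\n";

# prints: SCALAR: Updated
```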

References to scalar types have many uses; this simple example describes only one. You'll see others as you continue to read through the chapters of this book.

References to Scalar Arrays

Scalar arrays are arrays of Perl scalar types. You declare them using the @name syntax. Using a reference to the scalar array enables you to access the elements of an array individually or refer to the entire array, as shown in Table 2.2. Of course, you also can use the reference anywhere an array is expected, such as within a foreach() loop. The following example again illustrates the usefulness of references when passing arguments to subroutines. Consider the following code:

@array1 = (1, 3, 5);

@array2 = (2, 4, 6);

Now, what if you want to pass these arrays to a Perl subroutine and then access them within the subroutine, possibly modifying their values? If you've ever tried to pass two or more arrays to a Perl subroutine, then you know that it can't be done easily, because there's no way to determine where the first array ends and the next one begins. (Recall that the parameters passed to a Perl subroutine are accessible only through the @_ array and thus appear to be a single array to the subroutine that receives them.)
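A short sketch makes the flattening visible. The subroutine name count_args is hypothetical; it simply reports how many elements arrived in @_:

```perl
use strict;

my @array1 = (1, 3, 5);
my @array2 = (2, 4, 6);

# Returns how many arguments arrived in @_.
sub count_args {
    return scalar(@_);
}

my $count = count_args(@array1, @array2);
print "received $count arguments\n";

# prints: received 6 arguments
```

The subroutine sees six scalars and has no way to tell that they came from two arrays of three.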

Using references, you can circumvent this limitation. If you create references to each of the preceding arrays, you can easily pass two scalars to the subroutine and then dereference the arrays those scalars have been assigned to, like this:

@array1 = (1, 3, 5);

@array2 = (2, 4, 6);

$ref1 = \@array1;

$ref2 = \@array2;

@sum = &array_adder($ref1, $ref2);

print "\@sum = (", join(',',@sum), ")\n";



sub array_adder{

my($ref1, $ref2) = @_;

my $i = 0;

my @sum;

for($i = 0; $i <= $#{$ref1} ; $i++){

    $sum[$i] = ${$ref1}[$i] + ${$ref2}[$i];

}

return @sum;

}



# prints: @sum = (3,7,11)

Here, you've created a new array, whose elements are the sums of the corresponding elements of two equal-length arrays. That's easy, but you do it by passing the arrays to a subroutine using references and thus make a formerly difficult, or at least nonintuitive, task easier. In Perl4, you would have had to either use typeglobs or pass in the length of the arrays as the first or last argument and then split @_ appropriately. Not pretty.
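As a variation on the same idea, the sketch below uses a reference where a whole array is expected (in a foreach loop) and the arrow form $ref->[$i], which is equivalent to ${$ref}[$i]. The subroutine name array_multiplier is an assumption made for this illustration.

```perl
use strict;

my @array1 = (1, 3, 5);
my @array2 = (2, 4, 6);

# Multiply corresponding elements; both arguments are array references.
sub array_multiplier {
    my ($ref1, $ref2) = @_;
    my @product;
    my $i = 0;
    foreach my $value (@$ref1) {              # @$ref1 is the whole array
        push @product, $value * $ref2->[$i];  # $ref2->[$i] is one element
        $i++;
    }
    return @product;
}

my @product = array_multiplier(\@array1, \@array2);
print "\@product = (", join(',', @product), ")\n";

# prints: @product = (2,12,30)
```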

Note how you are able to use the reference within the subroutine in the $#array context (the highest index of the array from zero-base), as well as access the individual elements of the arrays that are being referred to. Again, this is just one single use for references to arrays. See the documentation mentioned previously for many more examples, PERLLOL for instance.

References to Hashes (Associative Arrays)

When you create a reference to an associative array (hash), you can access all the keys and values of the associative array through the reference. You can also use the reference in place of the hash, using the syntax in Table 2.2, within any given Perl function that operates on associative arrays or their elements, such as keys(), each(), and delete().

Hash references are extremely powerful. Using them, you can build up complex data structures containing all the Perl data types or references to those types. In the following example, you use the standard assignment/dereference syntax described previously and one of the other dereferencing syntaxes:

@array = (1, 2, 3);

%Hash = (

    'foo'=> 'bar',

    'aref' => \@array,

    'internalhash' => {

                       'birds' => 'duck',

                       'plants' => 'tomato'

                      }

);



# print out the simple scalar element

print $Hash{'foo'}, "\n";



# print out the elements of @array

print join(' ',@{$Hash{'aref'}}),"\n";



# print out the elements of the %internalhash

foreach $key  (keys( %{$Hash{'internalhash'}} )){

    print "Key is $key, value is $Hash{'internalhash'}->{$key} \n";

}



# prints:

bar

1 2 3

Key is plants, value is tomato

Key is birds, value is duck

Note how you dereference the value corresponding to the internalhash key of %Hash by using the -> dereferencing operator. You can use -> because the value is itself a reference to an anonymous hash. Because it's a reference, you can store it as a regular scalar element of the hash and still emulate a multidimensional hash.

CAUTION:

Remember, in spite of appearances, arrays are always one dimensional within Perl. You're only emulating multidimensionality here by using references. See PERLDSC for more details.
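The point of the caution can be sketched with a small table. The variable names here are hypothetical; the example only assumes the standard behavior that adjacent subscripts imply the -> operator between them.

```perl
use strict;

# A "two-dimensional" table is really a one-dimensional array
# whose elements are references to other arrays.
my @table = (
    [1, 2, 3],
    [4, 5, 6],
);

my $cell     = $table[1][2];     # short for $table[1]->[2]
my $row_kind = ref($table[0]);   # each row is an ARRAY reference

print "cell = $cell, row is a $row_kind reference\n";

# prints: cell = 6, row is a ARRAY reference
```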

References to References

References, like any other Perl data type, can themselves be referenced. You may, for instance, have several references to various types or data structures that you want to group together under a single reference. Here's an example:

$scalar = "a string";  # a regular scalar

$array = [1, 2, 3];    # anonymous array

$hash = {"foo" => "bar", "baz" => "blech"}; # anonymous hash

$scalarref = \$scalar;  # ref to scalar

$refref = [\$scalarref, \$array, \$hash];  # anon array of refs to refs



# print out the contents of the ref to ref to scalar

print $${$refref->[0]},"\n";



# print out the elements of the ref to ref to array

print join(' ',@{${$refref->[1]}}),"\n";



# print out the elements of the ref to ref to hash

foreach $key  (keys( %{${$refref->[2]}} )){

    print "Key is $key, value is ${${$refref->[2]}}{$key} \n";

}



# prints:

a string

1 2 3

Key is foo, value is bar

Key is baz, value is blech

You can go as deep as you like with references to references. Of course, at some point, readability may suffer. I recommend readability over complexity in most cases, especially in a public interface to a module or library, or within code that may require modifications by someone other than yourself in the future.

References to Subroutines

The last type of standard reference to look at, before getting to blessed references and object-oriented Perl, is the reference to code. Specifically, let's look at how to set up and use a reference to a subroutine.

References to subroutines may be useful in a number of situations. You can use them to implement closures, as previously described, or as subroutine parameters, or as part of complex data types. Subroutine references are also useful within packages, but note that a reference taken to a subroutine in another package names that one subroutine directly and does not take inheritance into account. See PERLSUB for more details.
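As a brief sketch of two of these uses, the code below builds a closure and then hands it to another subroutine as a parameter. The names make_counter and call_repeatedly are hypothetical, and the statement-modifier form of for assumes a reasonably recent Perl5.

```perl
use strict;

# Returns an anonymous subroutine that remembers its own $count.
sub make_counter {
    my $count = shift;
    return sub { return $count++ };
}

# Calls whatever subroutine reference it is given, $times times.
sub call_repeatedly {
    my ($code, $times) = @_;
    my @results;
    push @results, &$code() for 1 .. $times;
    return @results;
}

my $counter = make_counter(5);
my @seen = call_repeatedly($counter, 3);
print join(' ', @seen), "\n";

# prints: 5 6 7
```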

The following example illustrates a simple case in which you set up an array of references to subroutines in the same package. Here, you use two types of references to access the subroutines:

sub foo{

    return "I\'m in foo now\n";

}



sub bar{

    return "Here I am in bar\n";

}







%subrefs = ('foo' => \&foo, 'bar' => 'bar', ); # bar is a "fake" reference



while(($key,$ref) = each(%subrefs)){

   print ${key}, " : ", &$ref;

}



# prints:

foo : I'm in foo now

bar : Here I am in bar

You set up a single hash to contain multiple references to subroutines. The potential for dynamic runtime decision trees should be evident. You could arbitrarily assign references to subroutines at runtime, based on some input parameters, for instance, then execute the subroutines using the references. You could have easily done the same with a regular scalar array, emulating a C-style array of pointers to functions--not as powerful as a hash of function pointers, but potentially useful.
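The array variant mentioned above might look like the following sketch. The subroutines double and square are stand-ins invented for the example; the index chosen at run time selects which one is called.

```perl
use strict;

sub double { return 2 * $_[0] }
sub square { return $_[0] * $_[0] }

# An ordinary array holding references to subroutines.
my @operations = (\&double, \&square);

# Pick an operation at run time by index and call it through the reference.
my @results;
foreach my $index (0 .. $#operations) {
    push @results, &{ $operations[$index] }(7);
}
print join(' ', @results), "\n";

# prints: 14 49
```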

Note how, in the preceding example, the bar entry in the %subrefs declaration is not created with the \ operator. Instead, it is a fake reference (the Perl documentation calls this a symbolic reference): a plain string holding the subroutine's name, which is another way to access a given data type by its name. This technique could also be used in Perl4; however, it works for any data type in Perl5, and it is an actual reference in Perl5. You can easily declare another (fake) reference to a fake reference, to any depth. Be aware that fake references are disallowed when use strict 'refs' is in effect. The ref.t test suite has a nice example for using fake references. I won't discuss them much more, because they're not widely used, but they're worth noting.

References to Packages (When Blessed)

When the reference variable refers to a Perl package and has been blessed into the package, as shown in Table 2.2, it is known as a Perl object. It can then be used to store, and allow access to, any Perl data type that is used by the package, akin to public instance variables being accessed by a reference in C++.

You also can use the blessed reference to invoke the methods of the package, also analogous to a C++ class reference. When you use a blessed reference in this way, the method that gets invoked automatically has access to the object. This technique is very common in the Perl modules, and it's very powerful. You need to have a clear understanding of it in order to use and reuse the examples in this book and modify them to suit your needs.

Let me give you a simple example to illustrate the concepts. In this example, the object is a reference to hash. You could simplify it to a reference to array or scalar, if it were appropriate. A hash reference gives the most flexibility to access and grow the object dynamically. Consider the following:

package Customer;



sub new {

my $type = shift;

my %args = @_;

my $self = {};



$self->{'Name'} =

    length($args{'Name'}) ? $args{'Name'} : "No name given";

$self->{'Vitals'} =

    ref($args{'Vitals'}) eq 'ARRAY' ? $args{'Vitals'} : ["No vitals"];

bless $self, $type;

}



sub dumpcust{

my $self = shift;



# Print out the values for the object

print $self->{Name},"\n";

print join(' ',@{$self->{Vitals}}),"\n";

}



package main;



# Create a new customer object

$cust  = Customer->new( 'Name' => "Billy T. Kid",

        'Vitals' => ["Age : 42", "Sex : M"] );



# Invoke the method to print out the values for the object

$cust->dumpcust;



# prints:

Billy T. Kid

Age : 42 Sex : M

In this simple example, you see how to initialize the Perl object, which is a reference to hash, with both scalar elements and a reference to a scalar array. Then you invoke the dumpcust() method from the Customer package or class, using the blessed reference.
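To make the "automatic access to the object" point concrete, here is a small sketch using a hypothetical Greeter class. It shows that ref() on a blessed reference returns the package name, and that invoking a method with -> gives the same result as calling the subroutine by its full name and passing the object yourself.

```perl
use strict;

package Greeter;   # hypothetical class, used only for this sketch

sub new {
    my $type = shift;
    my $self = { 'Name' => shift };
    return bless $self, $type;
}

# Receives the object as its first argument when invoked with ->
sub name {
    my $self = shift;
    return $self->{'Name'};
}

package main;

my $obj   = Greeter->new("Billy");
my $class = ref($obj);              # the package the reference was blessed into
my $same  = ($obj->name eq Greeter::name($obj)) ? "yes" : "no";

print "class: $class, same result: $same\n";

# prints: class: Greeter, same result: yes
```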

I'll continue to develop this example as we progress into the extended study of the Perl module, and the object-oriented features of Perl programming.

An In-Depth Look at Perl5 Modules

You're now ready to take an extended look at Perl5 modules. In the following sections, I'll give you a short history of the extensibility of Perl to demonstrate the usefulness of modules. Then we'll look at the general style and process for implementing a module.

The Short History of Perl Extensions

In earlier versions of Perl, a number of external APIs, or extensions, were added to Perl to give it additional features or functionality. This was accomplished by compiling the Perl source code with source code that "glued in" the desired functions from the API. Linking with the external libraries for the desired API then created a new Perl executable. The following are a few examples:

cursePerl: UNIX terminal control routines
tkPerl: Tk routines for screen graphics and widgets
sybPerl: Sybase database manipulation routines

Each of these executables had to be separately maintained by the person at the site who compiled Perl, and each time one of the APIs was revised, that person had to rebuild the complete API-specific Perl executable, relinking it with the new library and working out any problems that might show up with the newer version. Perl5 has outdated this tedious process. Nowadays, when you want to add features to Perl by linking with some external library, you use a completely different technique.

Modules and Extensions: Purpose and Design

Version 5 of Perl implements the module as the standardized means for extending its functionality. A Perl module is very much like a Perl library in that it provides the user with a set of functions or variables, within a separate namespace known as a Perl package. The purpose of these functions or variables is to simplify and standardize the implementation of a given task or tasks.

Modules can be much more powerful and useful than the old-style Perl libraries, however. First, they can now be implemented as a shared library and thus, on many platforms, eliminate the need for a separate, statically linked Perl executable for each external API that is desired as a Perl extension. Second, they can be implemented to provide an interface for use in the new, object-oriented way.

Extension Modules: Modules That Interface to External APIs

A Perl5 module provides the implementor with the means, through the use of XSUBs, to compile C code that "wraps" or "glues" the functionality of an API into a shared library that can be loaded by Perl at runtime. After Perl loads the shared library, through a use or require statement, the glue code enables the end user to call the API's functions from within the Perl program, passing Perl data types in and out. This capability is one of the most powerful and useful new features in Perl5.

NOTE:

Not all architectures support shared libraries. Many do, but on the ones that don't, you must still link the API libraries with Perl to produce a Perl executable that provides the interface to the API.

Modules: The Object-Oriented Way

Another purpose of modules, briefly demonstrated and discussed previously, is to provide the ability to use them combined with Perl references to create object-oriented programs. You'll see and use the object-oriented features of modules extensively in this book to implement the examples and demonstrate the techniques that we consider to be the most up-to-date and cutting-edge for Web programming.

In the next section, you'll take a brief look at the general form of a module.

The Perl5 Module: Form and Function

The typical Perl module is a file that lives in the Perl library directories, @INC, and has Perl code that can be imported into your Perl program, through the use or require statements. If you've used Perl4, then you're probably familiar with require. The use() statement implies a BEGIN{} block, and thus the importation of the module's symbols occurs immediately. In general, the use() statement gives preferred functionality over the require statement, but they're each used widely. See PERLMOD for more details.
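The relationship between use and require can be sketched with the standard File::Basename module, chosen here only because it ships with Perl; the path is a made-up example. The BEGIN block below does by hand what use File::Basename; does for you.

```perl
use strict;

# "use File::Basename;" is executed at compile time and is equivalent to:
BEGIN {
    require File::Basename;     # load the module's file
    File::Basename->import();   # bring in its default exports
}

# basename() was imported, so it can be called without a package prefix.
my $name = basename("/usr/local/lib/perl5/Foo.pm");
print "$name\n";

# prints: Foo.pm
```

A plain require at run time would load the module too, but it would not call import(), so you would have to write File::Basename::basename() instead.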

After the code from the module is loaded into your Perl program, it can be used as if it were part of your program. Thus, a module typically provides useful subroutines (methods) that you can call, possibly passing in Perl variables and receiving back a simple result or modified or new variables. Such behavior is typical of both Perl modules and libraries.

A Simple Extension Module

A typical Perl5 module that provides an extension to an external API looks something like Listing 2.1.

Listing 2.1. A simple extension module.

package Foo;

=head1 NAME



Foo - Perl module that simplifies and standardizes the

process of accessing the Foo API.



=head1 SYNOPSIS



    use Foo;

    FooLoad(args);

    Foo::Print(args);

    Foo::Get(args);



=head1 DESCRIPTION



This module may be used to access the FooLoad(), Print(),

and Get() functions of the Foo API.



=head1 AUTHOR



I. B. Guru <guru@gurus.org>



=cut



require 5.000;

require Exporter;

require DynaLoader;

@ISA = qw(Exporter DynaLoader);

@EXPORT = qw(FooLoad);

@EXPORT_OK = qw(Print Get);

bootstrap Foo;



1;

__END__

At the top of Listing 2.1, the package name is specified. All the symbols that reside in the module live in this namespace, unless another package name is specified farther down into the module, in which case everything following that package statement will live in the newly declared package. A module can declare as many package statements internally as it wants, but the name of the file where the module lives must correspond to the module name given in the use() statement of the program that uses it (use Foo; looks for Foo.pm).

Ideally, the package statement is followed by an embedded POD section, which documents the functionality and use of the package's subroutines and variables. Every well-written module should have some documentation, minimally describing usage of the internal methods and hopefully providing at least one example.

Next, you get to the initialization statements, which cause the module to import methods and/or instantiate objects from other modules. The most commonly used module within other modules is the Exporter. The Exporter defines a standard set of methods that give a module the ability to export its methods and variables into Perl programs, or other Perl modules that use or require it. After the methods from the Exporter are imported, this module then defines which of its internal methods and variables will be exported to other modules by default via the @EXPORT or, on request, via the @EXPORT_OK array.

NOTE:

The process of importing, exporting, autoloading, dynaloading, and bootstrapping in external modules is complex. You must understand a number of implicit and explicit methods and techniques to use these processes in the development of new modules. Because the focus in this book is on using existing modules, and not on development of new modules, I won't attempt to explain these processes in detail. See the Module List, PERLMOD, Exporter, AutoLoader, and DynaLoader for an in-depth discussion. See any CPAN site for the latest version of the module list.

Perl Modules: Usage and Invocation Syntax

If a particular method is exported from a module, it means that when you use the module in your Perl program, the method will be accessible in the main:: package namespace, just like an ordinary Perl function, or a subroutine that lives in main::. The preceding Foo module, for example, exports a single method, FooLoad(), through the @EXPORT array, so if you declare the use() statement with it, you could write a short Perl program like Listing 2.2.

Listing 2.2. Using the simple extension module.

use Foo;

$loaded = FooLoad("FooFile");

Foo::Get($loaded);

Foo::Print($loaded);

Note how you are able to invoke the FooLoad() routine directly, but the Print() and Get() routines have to be invoked through the fully qualified name. Exporting all the methods from a module is enticing, but the practice is generally discouraged. A module that exports its methods into your program's namespace by default should have a valid reason for doing so. The module that does so takes the chance that one or more of its method or variable names will collide with a name that you are already using, potentially causing confusion and debugging headaches. In the preceding Foo module, the generically named Get() and Print() routines could easily cause such problems. Thus, they're exported through the @EXPORT_OK array and are loaded into the main:: namespace only upon request or invoked as shown via the full name. The general recommendation for using Perl modules is for you to access their methods through the full name and import nothing by default, unless you're going to subclass the module.
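Because the Foo module depends on an external API, the following sketch uses a hypothetical pure-Perl module, Demo, inlined into the program so that it can be run as is. The names Demo, demo_load, and demo_get are all invented for the illustration, and the our keyword assumes a reasonably recent Perl5. It shows that a name in @EXPORT arrives in main:: by default, while a name in @EXPORT_OK does not.

```perl
use strict;

BEGIN {
    package Demo;               # hypothetical module, inlined for the sketch
    require Exporter;
    our @ISA       = qw(Exporter);
    our @EXPORT    = qw(demo_load);   # exported by default
    our @EXPORT_OK = qw(demo_get);    # exported only on request

    sub demo_load { return "loaded $_[0]" }
    sub demo_get  { return "got $_[0]" }

    $INC{'Demo.pm'} = __FILE__;       # tell Perl the module is already loaded
}

use Demo;                       # imports demo_load only

my $loaded  = demo_load("FooFile");
my $got     = Demo::demo_get("FooFile");   # not imported: use the full name
my $has_get = defined(&main::demo_get) ? "yes" : "no";

print "$loaded; $got; demo_get imported: $has_get\n";

# prints: loaded FooFile; got FooFile; demo_get imported: no
```

Writing use Demo qw(demo_get); instead would request the optional name explicitly, at which point only demo_get would be imported.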

An even better way, in my opinion, is to use a blessed reference to the package to invoke the method. Of course, this assumes that some sort of new() method has been defined or inherited in the package to return a blessed reference to the package, or that you've explicitly blessed a reference into the package. Because the Foo module doesn't have a new() method, you'll need to use the latter technique, as shown in Listing 2.3.

Listing 2.3. Using a blessed reference to invoke methods from the Foo module.

use Foo;



my $packageref = {};  # must be a reference

bless $packageref, 'Foo'; # bless $packageref into the Foo package directly

$loaded = FooLoad("FooFile");

$newFoo = $packageref->Get($loaded);

$packageref->Print($newFoo);

This syntax just looks and feels cleaner, in my opinion. But there's more than one way to do it, as always, with Perl.

That wraps up the discussion regarding design and intent of modules and extensions. The following sections deal with the type of module that provides its own subroutines and doesn't load a shared library, which I call a regular module. Most of the WWW modules are of this type.

Using Regular Modules

Now we return to the Customer package from earlier in this chapter, where we were already doing most of the things discussed up to this point. Add a short POD (kept brief here), and one additional method, called addstat(), to wind up with what's shown in Listing 2.4.

Listing 2.4. The Customer module.

package Customer;



=head1 NAME

Customer - Perl module which implements a customer

object.



=head1 SYNOPSIS



    use Customer;

    $cust = new Customer(args);

    $cust->dumpcust();

    $cust->addstat();



=head1 DESCRIPTION



This module may be used to create a set of customer

objects, for use within a program which needs an

implementation of this sort.



=head1 AUTHOR



Bill Middleton <wmiddlet@adobe.com>



=cut



sub new {

my $type = shift;

my %args = @_;

my $self = {};





$self->{'Name'} =

    length($args{'Name'}) ? $args{'Name'} : "No name given";

$self->{'Vitals'} =

    ref($args{'Vitals'}) eq 'ARRAY' ? $args{'Vitals'} : ["No vitals"];

bless $self, $type;

}



sub addstat{

my $self = shift;

my $key = shift;

my $val = shift;

    $self->{$key} = $val;

}





sub dumpcust{

my $self = shift;



# Print out the values for the object

print $self->{Name},"\n";

print join(' ',@{$self->{Vitals}}),"\n";

}

1;

Note that this module now provides the explicit means to obtain a reference to itself, by providing the new() method. As previously demonstrated, the new() method usually initializes a reference with some values for the customer's name and vitals, based on the arguments passed in. These variables are also known as instance variables, because they exist for each new object that is created. Because you expect any access to this package to be through the blessed reference, you're not going to export any methods.
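The idea that instance variables exist separately for each object can be sketched with a hypothetical Account class (the class and its Owner field are invented for the example):

```perl
use strict;

package Account;   # hypothetical class for illustration

sub new {
    my $type = shift;
    my %args = @_;
    my $self = { 'Owner' => $args{'Owner'} };
    return bless $self, $type;
}

sub owner {
    my $self = shift;
    return $self->{'Owner'};
}

package main;

my $first  = Account->new('Owner' => "Billy");
my $second = Account->new('Owner' => "Pat");

# Changing one object's data leaves the other untouched.
$first->{'Owner'} = "William";
my $owners = join(" / ", $first->owner, $second->owner);
print "$owners\n";

# prints: William / Pat
```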

Now you can use the blessed reference variable to invoke the methods that are defined in the package safely, with no worries about namespace collisions. You have the added benefit of automatically passing in the instance variables that have been declared within the reference variable, or object, that was returned by the new() method. (Note that you'll need to have the Customer.pm module, from Listing 2.4, somewhere in @INC, or pasted into Listing 2.5, for it to work.)

Listing 2.5. Using the Customer module.

use Customer;

# Create a new customer object

$cust  = Customer->new( 'Name' => "Billy T. Kid",

        'Vitals' => ["Age : 10", "Sex : M"] );



# add a single statistic

$cust->addstat('Phone' => '1-203-456-7890');



# Print out the values for the object

$cust->dumpcust();

You will probably end up using a combination of the techniques I've discussed to work with modules within your Perl programs and obtain the level of functionality that you want. Each and every module that you use in your Perl programs will be a little bit different, certainly, but once you understand how you can use them, you're on your way to making use of any one of them.

Object-Oriented Techniques

I've already discussed some of the object-oriented functionality in Perl5, but I've intentionally ignored some details until now, in order to present them in a single section. Let's take a look at this functionality now in depth.

Motivation

After you become familiar with a module and its functionality and use, you may find yourself wishing for a feature or a method that is not implemented within the module. Say, for instance, that you obtain a module that provides you with a new() method to create an X object and perform operations on the X object such as Y() and Z(). Now suppose that you need to also be able to perform the A() operation on the X object. What would you do?

In the Perl4 days, you had two options. The first, and easiest, would have been to write to the author of the library or program, and ask him or her to add a subroutine to perform A() on the internal variable, X, for the next release. The author, assuming he or she had the same e-mail address published in the library or its README file and, more importantly, had the time and the desire to maintain and upgrade the code, would then consider your request and hopefully answer you with an affirmative reply. You would then wait patiently for the next release.

The other option would have been to copy the library or program, and then develop your own (private) additions and enhancements to it so that it would provide the A() subroutine, which you needed. Then, being a good Netizen, you would submit your changes back to the author, in the form of a patch, so that the functionality you added could (when tested and verified) be added to a later release. If the author decided that your new functionality wasn't appropriate for the master release and did not include it, then you were faced with the ongoing, repetitive task of obtaining the latest release and then figuring out how to fold in the new features and bug fixes from it into your own customized version. Clearly, this process can become a bit messy and time consuming as time goes on.

Inheritance

Nowadays, you can largely eliminate this dilemma by using the object-oriented features that you get with Perl5. If you need the functionality of a module but want to augment or even override its methods, doing so is a relatively simple matter. After you implement these additional features, submitting them to the author is still good Netizenship, but if he or she decides not to implement them in a future release, you're still in good shape. You simply install the next release of the module you've augmented, and your enhancements still work. You don't need to maintain separate copies, either.

Suppose you're using the Customer module, and everything is going fine. Now, say you want to add a method that would operate on the Customer object to provide a means to search for and return a particular aspect about a customer. You would then write a new module, perhaps called MyCust, and inherit all the functionality and methods from the Customer module. Then you would need to add only your method, search(), to MyCust to have all the functionality of Customer plus your search() method, as in Listing 2.6. (Note that you'll need Listing 2.4 again.)

Listing 2.6. Subclassing the Customer module.

package MyCust;

require Customer; # Customer.pm must be in @INC

@ISA = qw(Customer);



sub search{

my $self = shift;

my $key = shift;

return defined($self->{$key}) ? $self->{$key} : undef;

}

1;

Note how you use the @ISA array to declare that this class is a subclass of the Customer class. The @ISA array tells Perl that this package "is a" kind of each package it names, so methods not found in this package are looked up in those packages. In this case, MyCust is a Customer. The only code you have to write is the search() method, and when the next release of Customer comes out, you are (theoretically) unaffected by any internal changes that have been made by the author.
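The lookup through @ISA can be sketched in a single file with two hypothetical classes, Base and Derived. The isa() method used at the end comes from the UNIVERSAL class, so this sketch assumes a Perl5 release recent enough to provide it (and the our keyword).

```perl
use strict;

package Base;      # hypothetical parent class
sub new   { my $type = shift; return bless {}, $type }
sub hello { return "hello from Base" }

package Derived;   # hypothetical subclass
our @ISA = qw(Base);
sub extra { return "only in Derived" }

package main;

my $obj      = Derived->new; # new() is found in Base through @ISA
my $class    = ref($obj);    # but the object is blessed into Derived
my $greeting = $obj->hello;  # inherited method
my $is_base  = $obj->isa('Base') ? "yes" : "no";

print "$class: $greeting (isa Base: $is_base)\n";

# prints: Derived: hello from Base (isa Base: yes)
```

Note that the inherited new() blesses into whatever class name it was invoked with, which is why the two-argument form of bless matters for subclassing.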

NOTE:

Ordinarily, changes to the base class should not affect a subclass, unless the author changes the names of the methods or the structure of the object. Such changes are generally not a wise practice, once the class is considered stable and "out of alpha."

Now you can code your program to use the MyCust class instead of the base Customer class and have the additional functionality of the search() method, as shown in Listing 2.7 (this listing requires both Listing 2.4 and Listing 2.6):

Listing 2.7. Using the MyCust subclass.

use MyCust; # requires both Customer.pm and MyCust.pm to be in @INC



# Create a new customer object

$cust  = MyCust->new( 'Name' => "Billy T. Kid",

        'Vitals' => ["Age : 10", "Sex : M"] );



# add a single statistic

$cust->addstat('Phone' => '1-203-456-7890');



# search for and return the phone number

print "Customer's phone number is ",$cust->search('Phone'),"\n";



# prints:

Customer's phone number is 1-203-456-7890

Simple, fast, easy, and fun. In Listing 2.7, you've just used the object-oriented programming technique called inheritance. Hopefully, the usefulness and simplicity of object-oriented programming techniques are now easier to understand and believe in, if you haven't seen them before.

Method Override and Method Augmentation

Another powerful object-oriented technique is known as method override. Suppose that instead of needing to add an additional method to Customer, you need for an existing method in the module to behave in a different way. Well, because I've discussed the alternatives already, I'll assume that you want to implement the new behavior in the object-oriented way. The technique is very much like the preceding inheritance example, but this time you name your method the same as the method that you want to behave differently.

The dumpcust method within Customer, for instance, is not very useful if you add additional fields to the customer object using addstat. You can override it with something a bit more generic, as shown in Listing 2.8.

Listing 2.8. Overriding Customer's methods.

package MyCust;

require Customer; # requires Customer.pm from Listing 2.4

@ISA = qw(Customer);



sub search{

my $self = shift;

my $key = shift;

return defined($self->{$key}) ? $self->{$key} : undef;

}



sub dumpcust{

my $self = shift;

# Print out _all_ the values for the object

foreach $key (keys %{$self}){

    if($key eq 'Vitals'){

        print join(' ',@{$self->{$key}}),"\n";

    }

    else{

        print $self->{$key},"\n";

    }

}

}

1;

Now you can invoke the dumpcust() method within the MyCust class and get the behavior you want, having all the fields print. (Note that Listing 2.9 requires the code from both Listing 2.4 and 2.8.)

Listing 2.9. Using the overridden method in MyCust.

use MyCust; # Requires the code from Customer.pm to be in @INC



# Create a new customer object

$cust  = MyCust->new( 'Name' => "Billy T. Kid",

        'Vitals' => ["Age : 10", "Sex : M"] );



# add a single statistic

$cust->addstat('Phone' => '1-203-456-7890');



# add another statistic

$cust->addstat('Zip' => '12345');



# search for and return the phone number

print "Customer's phone number is ",$cust->search('Phone'),"\n";



# dump all values for customer, including the new Zip field

$cust->dumpcust();



# prints:

Customer's phone number is 1-203-456-7890

Billy T. Kid

1-203-456-7890

12345

Age : 10 Sex : M

Now, instead of a completely different behavior in a method, you may merely wish to augment its existing behavior. You accomplish this technique in a like manner, with the exception that you need to write only the code that provides your enhancement, and then invoke the parent method.

When you call your augmenting method, the first thing to do is call the original method from the parent class, passing in the object. Then you perform the additional operations you need.

Here we augment the dumpcust() method in MyCust to examine this technique. (Note again that Listing 2.10 requires Customer.pm, from Listing 2.4, to be in @INC, to work.)

Listing 2.10. Augmenting base class methods.

package MyCust;  # Requires Customer.pm in @INC

use Customer;

@ISA = qw(Customer);

sub search{
    my $self = shift;
    my $key = shift;
    return defined($self->{$key}) ? $self->{$key} : undef;
}

sub dumpcust{
    my $self = shift;
    # Let the parent class print the fields it knows about (Name and Vitals)
    Customer::dumpcust($self);
    # Then print the fields that were added to the object
    foreach my $key (keys %{$self}){
        next if($key =~ /^(?:Vitals|Name)$/);
        print $self->{$key},"\n";
    }
}

1;

Notice how in Listing 2.10 you first call the parent's method, Customer::dumpcust(), passing in the object, $self. Also notice that you invoke the parent's method using the Class::method() syntax. You can't use the $self->method() syntax here and rely on the object being passed automatically: because $self is blessed into the MyCust class, $self->dumpcust() would resolve to MyCust::dumpcust() again, and the method would call itself forever. Always stay mindful of what class an object belongs to when using inheritance.
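To see the call order in isolation, here is a minimal, self-contained sketch. The packages Base and Derived and the report() method are hypothetical stand-ins for Customer, MyCust, and dumpcust(); the method names are collected in an array rather than printed directly so that the order is easy to inspect.

```perl
#!/usr/bin/perl -w
use strict;

my @log;    # records the order in which the methods run

package Base;
sub new    { my $class = shift; return bless {}, $class; }
sub report { my $self = shift; push @log, 'Base::report'; }

package Derived;
@Derived::ISA = ('Base');       # Derived inherits new() from Base

sub report {
    my $self = shift;
    # Fully qualified call: runs the parent's code on our object.
    # Writing $self->report() here would re-enter Derived::report forever.
    Base::report($self);
    push @log, 'Derived::report';
}

package main;
my $obj = Derived->new();       # blessed into Derived via inherited new()
$obj->report();
print join(' then ', @log), "\n";
```

The object stays blessed into Derived throughout; only the lookup of the parent's method is bypassed by naming the package explicitly.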

NOTE:

We could also have used the UNIVERSAL class or the SUPER pseudo-class to implement method augmentation, but we're not going to discuss those techniques here. They became available after this chapter was originally written, and are best left to a more complete coverage of Perl5 OO techniques.

Now, when you run Listing 2.9 and it invokes the dumpcust() method, the output is ordered first according to the preference of the parent class, Customer, followed by the fields you've added to the customer object:

Customer's phone number is 1-203-456-7890

Billy T. Kid

Age : 10 Sex : M

1-203-456-7890

12345

That wraps up the discussion of object-oriented Perl programming. You've now seen the techniques used to implement a clean and fine-tuned level of functionality in your programs that use Perl5 and modules. I haven't covered all of the other object-oriented techniques and tricks, by any means, so I recommend that you investigate PERLBOT and PERLOBJ for more examples and in-depth explanations.

Practical Issues for Using Modules: Downloading and Installation

At this point, you've seen a few of the techniques for Perl programming using modules, references, and objects. Now how do you actually start using them? This process can be a simple matter of typing

perl Makefile.PL

and then

make

from the shell prompt, if you're running your Web server on a UNIX machine.

Installing modules can also be a rather difficult process, depending on which module you want to use and whether your architecture is supported for installing that module. If you're running on UNIX, then you also need write permission to the library directories listed in @INC. For some installations, you also must have the requisite tools, such as a C compiler, make, and a linker that creates shared libraries, as mentioned previously.

Although some of the more important WWW modules are written to be architecture independent, you usually must take some configuration steps to use a typical module on any platform but UNIX. Again, the cross-platform issues are under constant consideration among the folks who bring you Perl.
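Before you attempt an installation, it can help to see which library directories your Perl searches and whether you can write to them. The following is a rough sketch, not part of any installer; the module_installed() helper is a hypothetical name introduced here for illustration, and File::Basename is used only because it ships with Perl.

```perl
#!/usr/bin/perl -w
use strict;

# Report each library directory and whether the current user can write to it.
foreach my $dir (@INC) {
    next if ref $dir;           # @INC may also hold code references; skip them
    my $status;
    if    (!(-d $dir)) { $status = 'missing'; }
    elsif (-w $dir)    { $status = 'writable'; }
    else               { $status = 'read-only'; }
    print "$dir: $status\n";
}

# Hypothetical helper: returns 1 if the named module can be loaded, else 0.
sub module_installed {
    my $module = shift;
    (my $file = "$module.pm") =~ s{::}{/}g;   # Foo::Bar becomes Foo/Bar.pm
    return eval { require $file; 1 } ? 1 : 0;
}

print "File::Basename is ",
      (module_installed('File::Basename') ? '' : 'not '), "installed\n";
```

If none of the directories is writable, you'll need help from whoever administers the machine, or a private library directory that you add to @INC yourself.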

Traps for Perl4 Programmers Migrating to Perl5

The introduction of all the new features and power in Perl5 has yielded some incompatibilities between Perl4 and Perl5. Most well-written Perl4 programs do run under Perl5 and produce the same results as they did under Perl4. A number of specific examples of Perl4 code, however, may not do what you expect them to do when you run them under Perl5. See PERLTRAP for a description of many of these examples. The latest version of PERLTRAP, as of the release of this book, is also included in Appendix A.

In general, you need to inspect your older Perl4 programs against the examples in PERLTRAP to be sure that they don't include any of the items that may produce unexpected results. A program that fails to compile at least tells you where the problem is; the traps to watch out for are the ones that compile cleanly and then quietly behave differently. Also, using the -w switch to Perl is always advisable when you're running any script.
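As one illustration, PERLTRAP notes that the @ character always interpolates an array in double-quoted strings under Perl5, which bites Perl4 scripts that embed e-mail addresses. The sketch below shows two spellings that keep the @ literal; the address and variable names are made up for the example.

```perl
#!/usr/bin/perl -w
use strict;

# Perl5 treats "@name" inside double quotes as an array to interpolate,
# so a literal @ must be escaped or placed in single quotes.
my $escaped = "webmaster\@example.com";   # backslash keeps the @ literal
my $quoted  = 'webmaster@example.com';    # single quotes never interpolate

# Intentional interpolation of a real array still works as expected.
my @hosts = ('alpha', 'beta');
my $list  = "hosts: @hosts";

print "$escaped\n$quoted\n$list\n";
```

Running with -w, as the shebang line does here, is what brings this kind of problem to your attention early.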

Summary

If you've completed this chapter, congratulations! You've now been introduced to the latest and greatest version of Perl, and most of its powerful new features. You've also been exposed to some relatively simple examples which utilize these new features.

In general, the most important concepts you should've gleaned from reading this chapter are

A general understanding of the new Perl5 features.
A good grasp of the various uses for references.
An understanding of how the Perl5 module works, and its variations.
What a Perl5 object is, and how it is used.
How to invoke a module's methods directly, and through the use of blessed references.
The difference between require() and use().
The differences and similarities between an imported method via the use() statement and an inherited method via the @ISA array.
How to subclass a given Perl5 module, and how to override/augment a parent class's methods.

If you're satisfied that you understand these items, then you're now ready to explore the rest of the chapters and examples in this book. Where appropriate, we may point you back to this chapter to refresh your memory as you go along.

Hopefully, you've also taken the time to explore the external Perl5 POD documentation along the way and learn how it works. You'll be glad you did.

PERL	Perl overview
PERLTOC	Perl documentation table of contents
PERLDATA	Perl data structures
PERLSYN	Perl syntax
PERLOP	Perl operators and precedence
PERLRE	Perl regular expressions
PERLRUN	Perl execution and options
PERLFUNC	Perl built-in functions
PERLVAR	Perl predefined variables
PERLSUB	Perl subroutines
PERLMOD	Perl modules
PERLFORM	Formats, and using write()
PERLREF	Perl references
PERLDSC	Perl data structures intro
PERLLOL	Perl data structures: lists of lists
PERLOBJ	Perl objects
PERLTIE	Perl objects hidden behind simple variables
PERLBOT	Perl OO tricks and examples
PERLIPC	Perl interprocess communication
PERLDEBUG	Perl debugging
PERLDIAG	Perl diagnostic messages
PERLSEC	Perl security
PERLTRAP	Perl traps for the unwary
PERLSTYLE	Perl style guide
PERLXS	Perl XS application programming interface
PERLXSTUT	Perl XS tutorial
PERLGUTS	Perl internal functions for creating extensions
PERLCALL	Perl calling conventions from C
PERLEMBED	Perl: how to embed Perl in your C or C++ application
PERLPOD	Perl plain old documentation
PERLAPIO	Perl internal IO abstraction interface
PERLBOOK	Perl book information

Data Type	Assignment Syntax	Dereference Syntax
Scalar	$ref = \$var;	$$ref
Array	$ref = \@array;	@{$ref} or ${$ref}[0] for individual elements
Hash	$ref = \%hash;	%{$ref} or ${$ref}{key} for individual elements
Reference	$refref = \$ref;	$$$refref
Subroutine (CODE)	$ref = \&sub;	&$ref
Package	bless $ref, Package;	$ref->method() or $ref->{variable}
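The rows of this table can be exercised in one short script. This is only a sketch: the variable names, the greet() subroutine, and the Counter package are invented for the example, and the object is assumed to be a blessed hash reference.

```perl
#!/usr/bin/perl -w
use strict;

my $var   = 'scalar value';
my @array = (10, 20, 30);
my %hash  = (key => 'value');
sub greet { return "hello, $_[0]"; }

my $sref   = \$var;       # scalar reference
my $aref   = \@array;     # array reference
my $href   = \%hash;      # hash reference
my $refref = \$sref;      # reference to a reference
my $cref   = \&greet;     # subroutine (CODE) reference

print $$sref, "\n";              # scalar value
print "@{$aref}\n";              # 10 20 30
print ${$aref}[0], "\n";         # 10
print ${$href}{key}, "\n";       # value
print $$$refref, "\n";           # scalar value
print &$cref('world'), "\n";     # hello, world

# Blessing a reference associates it with a package, so methods can be
# invoked through it.
package Counter;
sub new  { my $class = shift; return bless { count => 0 }, $class; }
sub bump { my $self = shift; return ++$self->{count}; }

package main;
my $obj = Counter->new();
$obj->bump();
print $obj->{count}, "\n";       # 1
```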
