NAME
perldelta - what is new for perl 5.10.0
DESCRIPTION
This document describes the differences between the 5.8.8 release and the 5.10.0 release.
Many of the bug fixes in 5.10.0 were already seen in the 5.8.X maintenance releases; they are not duplicated here and are documented in the set of man pages named perl58[1-8]?delta.
Core Enhancements
The feature pragma
The feature pragma is used to enable new syntax that would break Perl's backwards-compatibility with older releases of the language. It's a lexical pragma, like strict or warnings.
Currently the following new features are available: switch (adds a switch statement), say (adds a say built-in function), and state (adds a state keyword for declaring "static" variables). Those features are described in their own sections of this document.
The feature pragma is also implicitly loaded when you require a minimal perl version (with the use VERSION construct) greater than, or equal to, 5.9.5. See feature for details.
New -E command-line switch
-E is equivalent to -e, but it implicitly enables all optional features (like use feature ":5.10").
Defined-or operator
A new operator // (defined-or) has been implemented. The following expression:
    $a // $b
is equivalent to
    defined $a ? $a : $b
and the statement
    $c //= $d;
can now be used instead of
    $c = $d unless defined $c;
The // operator has the same precedence and associativity as ||. Special care has been taken to ensure that this operator does what you mean while not breaking old code, but some edge cases involving the empty regular expression may now parse differently. See perlop for details.
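As a minimal sketch of the difference between || and //, the snippet below uses a hypothetical %config hash (not part of this document) in which 0 is a legitimate value:

```perl
use strict;
use warnings;

# Hypothetical configuration values, used only for illustration.
my %config = (retries => 0, name => undef);

# || treats the defined-but-false value 0 as missing.
my $retries_or  = $config{retries} || 3;

# // falls through to the default only when the left side is undefined.
my $retries_dor = $config{retries} // 3;

# //= assigns only if the variable is currently undefined.
my $name = $config{name};
$name //= 'anonymous';

print "retries with ||: $retries_or\n";
print "retries with //: $retries_dor\n";
print "name: $name\n";
```

Here || replaces the 0 with the default, while // keeps it.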
Switch and Smart Match operator
Perl 5 now has a switch statement. It's available when use feature 'switch' is in effect. This feature introduces three new keywords, given, when, and default:
    given ($foo) {
        when (/^abc/) { $abc = 1; }
        when (/^def/) { $def = 1; }
        when (/^xyz/) { $xyz = 1; }
        default { $nothing = 1; }
    }
A more complete description of how Perl matches the switch variable against the when conditions is given in "Switch statements" in perlsyn.
This kind of match is called smart match, and it's also possible to use it outside of switch statements, via the new ~~ operator. See "Smart matching in detail" in perlsyn.
This feature was contributed by Robin Houston.
Regular expressions
Recursive Patterns
It is now possible to write recursive patterns without using the (??{}) construct. This new way is more efficient, and in many cases easier to read. Each capturing parenthesis can now be treated as an independent pattern that can be entered by using the (?PARNO) syntax (PARNO standing for "parenthesis number"). For example, the following pattern will match nested balanced angle brackets:
    /
     ^                      # start of line
     (                      # start capture buffer 1
        <                   #   match an opening angle bracket
        (?:                 #   match one of:
            (?>             #     don't backtrack over the inside of this group
                [^<>]+      #       one or more non angle brackets
            )               #     end non backtracking group
        |                   #     ... or ...
            (?1)            #     recurse to bracket 1 and try it again
        )*                  #   0 or more times.
        >                   #   match a closing angle bracket
     )                      # end capture buffer one
     $                      # end of line
    /x
PCRE users should note that Perl's recursive regex feature allows backtracking into a recursed pattern, whereas in PCRE the recursion is atomic or "possessive" in nature. As in the example above, you can add (?>) to control this selectively. (Yves Orton)
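The pattern above can be tried against a few strings. This is only a sketch; the sample inputs are made up for illustration:

```perl
use strict;
use warnings;

# Same structure as the pattern described above, stored in a qr// object.
my $balanced = qr/
    ^
    (                      # capture buffer 1
        <
        (?:
            (?> [^<>]+ )   # atomic run of non-bracket characters
          |
            (?1)           # recurse into buffer 1
        )*
        >
    )
    $
/x;

# Hypothetical sample inputs.
for my $candidate ('<a<b>c>', '<a<b>', '<>') {
    my $verdict = $candidate =~ $balanced ? 'balanced' : 'not balanced';
    print "$candidate: $verdict\n";
}
```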
Named Capture Buffers
It is now possible to name capturing parentheses in a pattern and refer to the captured contents by name. The naming syntax is (?<NAME>....). It's possible to backreference to a named buffer with the \k<NAME> syntax. In code, the new magical hashes %+ and %- can be used to access the contents of the capture buffers. Thus, to replace all doubled chars with a single copy, one could write
    s/(?<letter>.)\k<letter>/$+{letter}/g
Only buffers with defined contents will be "visible" in the %+ hash, so it's possible to do something like
    foreach my $name (keys %+) {
        print "content of buffer $name is $+{$name}\n";
    }
The %- hash is a bit more complete, since it will contain array refs holding values from all capture buffers similarly named, if there should be many of them.
%+ and %- are implemented as tied hashes through the new module Tie::Hash::NamedCapture. Users exposed to the .NET regex engine will find that the perl implementation differs in that the numerical ordering of the buffers is sequential, and not "unnamed first, then named". Thus in the pattern
    /(A)(?<B>B)(C)(?<D>D)/
$1 will be 'A', $2 will be 'B', $3 will be 'C' and $4 will be 'D', and not $1 is 'A', $2 is 'C', $3 is 'B' and $4 is 'D' as a .NET programmer would expect. This is considered a feature. :-) (Yves Orton)
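A short sketch of named captures in use follows. The date string and the buffer names year, month and day are illustrative assumptions, not taken from this document:

```perl
use strict;
use warnings;

# Hypothetical input; the buffer names are chosen for this example only.
my $date = '2007-12-18';
my %parts;
if ($date =~ /(?<year>\d{4})-(?<month>\d\d)-(?<day>\d\d)/) {
    # Copy the named buffers out of %+ before another match resets them.
    $parts{$_} = $+{$_} for keys %+;
}

# Collapse doubled characters using a named backreference.
(my $collapsed = 'aabbccd') =~ s/(?<letter>.)\k<letter>/$+{letter}/g;

print "$_ = $parts{$_}\n" for sort keys %parts;
print "collapsed: $collapsed\n";
```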
Possessive Quantifiers
Perl now supports the "possessive quantifier" syntax of the "atomic match" pattern. Basically a possessive quantifier matches as much as it can and never gives any back. Thus it can be used to control backtracking. The syntax is similar to non-greedy matching, except instead of using a '?' as the modifier the '+' is used. Thus ?+, *+, ++, {min,max}+ are now legal quantifiers. (Yves Orton)
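The "never gives any back" behaviour can be seen by comparing a greedy and a possessive quantifier on the same input. This is a minimal sketch with a made-up input string:

```perl
use strict;
use warnings;

my $input = 'aaa';

# Greedy a+ takes all three characters, then backtracks one so the final
# literal 'a' can match.
my $greedy_matches = $input =~ /^a+a$/ ? 1 : 0;

# Possessive a++ takes all three characters and refuses to give any back,
# so there is nothing left for the final literal 'a'.
my $possessive_matches = $input =~ /^a++a$/ ? 1 : 0;

print "greedy:     $greedy_matches\n";
print "possessive: $possessive_matches\n";
```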
Backtracking control verbs
The regex engine now supports a number of special-purpose backtrack control verbs: (*THEN), (*PRUNE), (*MARK), (*SKIP), (*COMMIT), (*FAIL) and (*ACCEPT). See perlre for their descriptions. (Yves Orton)
Relative backreferences
A new syntax \g{N} or \gN where N is a decimal integer allows a safer form of back-reference notation as well as allowing relative backreferences. This should make it easier to generate and embed patterns that contain backreferences. See "Capture buffers" in perlre. (Yves Orton)
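As an illustration of why relative backreferences help when embedding patterns, the sketch below builds a fragment that refers to "the group just before me" and interpolates it into a larger pattern. The variable names and sample strings are hypothetical:

```perl
use strict;
use warnings;

# \g{-1} means "the most recently opened capture group", so this fragment
# keeps working no matter how many groups precede it in the final pattern.
my $doubled_char = qr/(\w)\g{-1}/;

# Embedded after another capture group: (\w) is now group 2, and \g{-1}
# still refers to it. An absolute \1 would have pointed at (\d+) instead.
my $line = qr/^(\d+):.*$doubled_char/;

my ($number, $char);
if ('42:hello' =~ $line) {
    ($number, $char) = ($1, $2);
    print "number=$number doubled=$char\n";
}
my $no_double = '42:helo' =~ $line ? 1 : 0;
```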
\K escape
The functionality of Jeff Pinyan's module Regexp::Keep has been added to the core. In regular expressions you can now use the special escape \K as a way to do something like floating length positive lookbehind. It is also useful in substitutions like:
    s/(foo)bar/$1/g
that can now be converted to
    s/foo\Kbar//g
which is much more efficient. (Yves Orton)
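The two substitutions can be checked side by side. This sketch uses a made-up input string and only shows that both forms give the same result; it does not measure the efficiency claim:

```perl
use strict;
use warnings;

my $text = 'foobar foobaz barfoo';

# Classic form: capture the prefix and put it back.
(my $with_capture = $text) =~ s/(foo)bar/$1/g;

# \K form: everything matched before \K is kept, only 'bar' is replaced.
(my $with_keep = $text) =~ s/foo\Kbar//g;

print "capture: $with_capture\n";
print "keep:    $with_keep\n";
```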
Vertical and horizontal whitespace, and linebreak
Regular expressions now recognize the \v and \h escapes that match vertical and horizontal whitespace, respectively. \V and \H logically match their complements.
\R matches a generic linebreak, that is, vertical whitespace, plus the multi-character sequence "\x0D\x0A".
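A brief sketch of these escapes, using made-up strings with mixed line endings and mixed horizontal whitespace:

```perl
use strict;
use warnings;

# Mixed line endings: CRLF, LF and a bare CR.
my $text  = "one\r\ntwo\nthree\rfour";
my @lines = split /\R/, $text;

# \h matches spaces and tabs (horizontal whitespace) but not newlines.
my @fields = split /\h+/, "a \t b";

print scalar(@lines), " lines: @lines\n";
print scalar(@fields), " fields: @fields\n";
```

Because \R treats "\x0D\x0A" as a single linebreak, the CRLF pair does not produce an empty element.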
say()
say() is a new built-in, only available when use feature 'say' is in effect, that is similar to print(), but that implicitly appends a newline to the printed string. See "say" in perlfunc. (Robin Houston)
Lexical $_
The default variable $_ can now be lexicalized, by declaring it like any other lexical variable, with a simple
    my $_;
The operations that default on $_ will use the lexically-scoped version of $_ when it exists, instead of the global $_.
In a map or a grep block, if $_ was previously my'ed, then the $_ inside the block is lexical as well (and scoped to the block).
In a scope where $_ has been lexicalized, you can still have access to the global version of $_ by using $::_, or, more simply, by overriding the lexical declaration with our $_. (Rafael Garcia-Suarez)
The _ prototype
A new prototype character has been added. _ is equivalent to $ but defaults to $_ if the corresponding argument isn't supplied (both $ and _ denote a scalar). Due to the optional nature of the argument, you can only use it at the end of a prototype, or before a semicolon.
This has a small incompatible consequence: the prototype() function has been adjusted to return _ for some built-ins in appropriate cases (for example, prototype('CORE::rmdir')). (Rafael Garcia-Suarez)
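A minimal sketch of a user-defined subroutine with the _ prototype follows. The subroutine name shout is hypothetical and exists only for this example:

```perl
use strict;
use warnings;

# Hypothetical helper: takes one scalar, or $_ when called without one.
sub shout(_) { return uc $_[0] }

my @shouted;
for (qw(alpha beta)) {
    push @shouted, shout();      # no argument supplied: $_ is used
}
push @shouted, shout('gamma');   # explicit argument

print "@shouted\n";
```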
UNITCHECK blocks
UNITCHECK, a new special code block, has been introduced, in addition to BEGIN, CHECK, INIT and END.
CHECK and INIT blocks, while useful for some specialized purposes, are always executed at the transition between the compilation and the execution of the main program, and thus are useless whenever code is loaded at runtime. On the other hand, UNITCHECK blocks are executed just after the unit which defined them has been compiled. See perlmod for more information. (Alex Gough)
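To illustrate the timing for code loaded at runtime, the sketch below compiles a small unit with a string eval and records the order of events. The @events array and the messages are assumptions made for this example:

```perl
use strict;
use warnings;

our @events;
push @events, 'main program running';

# The string eval is a compilation unit of its own, compiled at runtime.
my $ok = eval q{
    UNITCHECK { push @events, 'UNITCHECK: unit compiled' }
    push @events, 'unit body running';
    1;
};
die $@ unless $ok;

print "$_\n" for @events;
```

The UNITCHECK entry lands between the other two: after the main program has started, but before the body of the freshly compiled unit runs.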
New Pragma, mro
A new pragma, mro (for Method Resolution Order), has been added. It lets you switch, on a per-class basis, the algorithm that perl uses to find inherited methods in case of a multiple inheritance hierarchy. The default MRO hasn't changed (DFS, for Depth First Search). Another MRO is available: the C3 algorithm. See mro for more information. (Brandon Black)
Note that, due to changes in the implementation of class hierarchy search, code that used to undef the *ISA glob will most probably break. Anyway, undef'ing *ISA had the side-effect of removing the magic on the @ISA array and should not have been done in the first place. Also, the cache *::ISA::CACHE:: no longer exists; to force reset the @ISA cache, you now need to use the mro API, or more simply to assign to @ISA (e.g. with @ISA = @ISA).
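The difference between the two orders shows up in a diamond-shaped hierarchy. The class names below are hypothetical and exist only for this sketch:

```perl
use strict;
use warnings;

# Hypothetical diamond: both Left and Right inherit from Base.
package Base;       sub hello { 'Base::hello' }
package Left;       our @ISA = ('Base');
package Right;      our @ISA = ('Base'); sub hello { 'Right::hello' }

# Default (DFS) order: DiamondDFS, Left, Base, Right.
package DiamondDFS; our @ISA = ('Left', 'Right');

# C3 order: DiamondC3, Left, Right, Base.
package DiamondC3;  use mro 'c3'; our @ISA = ('Left', 'Right');

package main;

my $dfs_result = DiamondDFS->hello;   # reaches Base before Right
my $c3_result  = DiamondC3->hello;    # reaches Right before Base

print "DFS: $dfs_result\n";
print "C3:  $c3_result\n";
print "C3 linearization: @{ mro::get_linear_isa('DiamondC3') }\n";
```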
readdir() may return a "short filename" on Windows
The readdir() function may return a "short filename" when the long filename contains characters outside the ANSI codepage. Similarly Cwd::cwd() may return a short directory name, and glob() may return short names as well. On the NTFS file system these short names can always be represented in the ANSI codepage. This will not be true for all other file system drivers; e.g. the FAT filesystem stores short filenames in the OEM codepage, so some files on FAT volumes remain inaccessible through the ANSI APIs.
Similarly, $^X, @INC, and $ENV{PATH} are preprocessed at startup to make sure all paths are valid in the ANSI codepage (if possible).
The Win32::GetLongPathName() function now returns the UTF-8 encoded correct long file name instead of using replacement characters to force the name into the ANSI codepage. The new Win32::GetANSIPathName() function can be used to turn a long pathname into a short one only if the long one cannot be represented in the ANSI codepage.
Many other functions in the Win32 module have been improved to accept UTF-8 encoded arguments. Please see Win32 for details.
readpipe() is now overridable
The built-in function readpipe() is now overridable. Overriding it also overrides its operator counterpart, qx// (a.k.a. ``). Moreover, it now defaults to $_ if no argument is provided. (Rafael Garcia-Suarez)
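One possible use is intercepting external commands in tests. The sketch below installs a global override through CORE::GLOBAL; the replacement subroutine and its "intercepted:" output format are assumptions made for this example, and no external command is actually run:

```perl
use strict;
use warnings;

BEGIN {
    no warnings 'once';
    # Install the override at compile time so that the calls compiled
    # below pick it up. The replacement just echoes the command string.
    *CORE::GLOBAL::readpipe = sub {
        my ($command) = @_;
        return "intercepted: $command";
    };
}

my $via_function = readpipe('echo hello');
my $via_operator = `echo hello`;

print "$via_function\n";
print "$via_operator\n";
```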
Default argument for readline()
readline() now defaults to *ARGV if no argument is provided. (Rafael Garcia-Suarez)
state() variables
A new class of variables has been introduced. State variables are similar to my variables, but are declared with the state keyword in place of my. They're visible only in their lexical scope, but their value is persistent: unlike my variables, they're not undefined at scope entry, but retain their previous value. (Rafael Garcia-Suarez, Nicholas Clark)
To use state variables, one needs to enable them by using
    use feature 'state';
or by using the -E command-line switch in one-liners. See "Persistent variables via state()" in perlsub.
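A small sketch of a counter built on a state variable; the subroutine name next_id is hypothetical:

```perl
use strict;
use warnings;
use feature 'state';

# Hypothetical ID generator: $id is initialised once and then keeps its
# value between calls, while staying invisible outside the subroutine.
sub next_id {
    state $id = 0;
    return ++$id;
}

my @ids = map { next_id() } 1 .. 3;
print "@ids\n";
```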
Stacked filetest operators
As a new form of syntactic sugar, it's now possible to stack up filetest operators. You can now write -f -w -x $file in a row to mean -x $file && -w _ && -f _. See "-X" in perlfunc.
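The equivalence can be checked on a scratch file. This sketch assumes the core File::Temp module is available and that the temporary file is readable and writable by its owner:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a throwaway file to test against.
my ($fh, $path) = tempfile(UNLINK => 1);
close $fh;

# Stacked form: operators are applied right to left.
my $stacked = (-f -r -w $path) ? 1 : 0;

# Equivalent long form, reusing the stat buffer through _.
my $chained = (-w $path && -r _ && -f _) ? 1 : 0;

print "stacked=$stacked chained=$chained\n";
```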
UNIVERSAL::DOES()
The UNIVERSAL class has a new method, DOES(). It has been added to solve semantic problems with the isa() method. isa() checks for inheritance, while DOES() has been designed to be overridden when module authors use other types of relations between classes (in addition to inheritance). (chromatic)
See "$obj->DOES( ROLE )" in UNIVERSAL.
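A sketch of a class that advertises a role without inheriting from it follows. The class name Duck and the role name Quacker are made up for illustration:

```perl
use strict;
use warnings;

package Duck;

sub new { return bless {}, shift }

# Claim the hypothetical 'Quacker' role, and defer to the default
# UNIVERSAL::DOES (which behaves like isa) for everything else.
sub DOES {
    my ($self, $role) = @_;
    return 1 if $role eq 'Quacker';
    return $self->SUPER::DOES($role);
}

package main;

my $duck = Duck->new;
my $does_role = $duck->DOES('Quacker') ? 1 : 0;
my $isa_role  = $duck->isa('Quacker')  ? 1 : 0;
my $does_self = $duck->DOES('Duck')    ? 1 : 0;

print "DOES Quacker: $does_role, isa Quacker: $isa_role, DOES Duck: $does_self\n";
```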
Formats
Formats were improved in several ways. A new field, ^*, can be used for variable-width, one-line-at-a-time text. Null characters are now handled correctly in picture lines. Using @# and ~~ together will now produce a compile-time error, as those format fields are incompatible. perlform has been improved, and miscellaneous bugs fixed.
Byte-order modifiers for pack() and unpack()
There are two new byte-order modifiers, > (big-endian) and < (little-endian), that can be appended to most pack() and unpack() template characters and groups to force a certain byte-order for that type or group. See "pack" in perlfunc and perlpacktut for details.
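A minimal sketch of the modifiers applied to the 32-bit L and 16-bit S formats; the values are arbitrary:

```perl
use strict;
use warnings;

my $value = 0x01020304;

# Same 32-bit value, packed in each byte order.
my $big    = pack 'L>', $value;
my $little = pack 'L<', $value;

# Reading two bytes back as a big-endian 16-bit integer.
my $short = unpack 'S>', "\x01\x02";

printf "big:    %s\n", join ' ', map { sprintf '%02x', ord } split //, $big;
printf "little: %s\n", join ' ', map { sprintf '%02x', ord } split //, $little;
print  "short:  $short\n";
```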
no VERSION
You can now use no followed by a version number to specify that you want to use a version of perl older than the specified one.
chdir, chmod and chown on filehandles
chdir, chmod and chown can now work on filehandles as well as filenames, if the system supports respectively fchdir, fchmod and fchown, thanks to a patch provided by Gisle Aas.
OS groups
$( and $) now return groups in the order in which the OS returns them, thanks to Gisle Aas. This wasn't previously the case.
Recursive sort subs
You can now use recursive subroutines with sort(), thanks to Robin Houston.
Exceptions in constant folding
The constant folding routine is now wrapped in an exception handler, and if folding throws an exception (such as attempting to evaluate 0/0), perl now retains the current optree, rather than aborting the whole program. Without this change, programs would not compile if they had expressions that happened to generate exceptions, even though those expressions were in code that could never be reached at runtime. (Nicholas Clark, Dave Mitchell)
Source filters in @INC
It's possible to enhance the mechanism of subroutine hooks in @INC by adding a source filter on top of the filehandle opened and returned by the hook. This feature was planned a long time ago, but wasn't quite working until now. See "require" in perlfunc for details. (Nicholas Clark)
New internal variables
${^RE_DEBUG_FLAGS}
This variable controls what debug flags are in effect for the regular expression engine when running under use re "debug". See re for details.
${^CHILD_ERROR_NATIVE}
This variable gives the native status returned by the last pipe close, backtick command, successful call to wait() or waitpid(), or from the system() operator. See perlrun for details. (Contributed by Gisle Aas.)
${^RE_TRIE_MAXBUF}
See "Trie optimisation of literal string alternations".
${^WIN32_SLOPPY_STAT}
See "Sloppy stat on Windows".
Miscellaneous
unpack() now defaults to unpacking the $_ variable.
mkdir() without arguments now defaults to $_.
The internal dump output has been improved, so that non-printable characters such as newline and backspace are output in \x notation, rather than octal.
The -C option can no longer be used on the #! line. It wasn't working there anyway, since the standard streams are already set up at this point in the execution of the perl interpreter. You can use binmode() instead to get the desired behaviour.
UCD 5.0.0
The copy of the Unicode Character Database included in Perl 5 has been updated to version 5.0.0.
MAD
MAD, which stands for Miscellaneous Attribute Decoration, is a still-in-development work leading to a Perl 5 to Perl 6 converter. To enable it, it's necessary to pass the argument -Dmad to Configure. The obtained perl isn't binary compatible with a regular perl 5.10, and has space and speed penalties; moreover not all regression tests still pass with it. (Larry Wall, Nicholas Clark)
kill() on Windows
On Windows platforms, kill(-9, $pid) now kills a process tree. (On UNIX, this delivers the signal to all processes in the same process group.)
Incompatible Changes
Packing and UTF-8 strings
The semantics of pack() and unpack() regarding UTF-8-encoded data has been changed. Processing is now by default character per character instead of byte per byte on the underlying encoding. Notably, code that used things like pack("a*", $string) to see through the encoding of string will now simply get back the original $string. Packed strings can also get upgraded during processing when you store upgraded characters. You can get the old behaviour by using use bytes.
To be consistent with pack(), the C0 in unpack() templates indicates that the data is to be processed in character mode, i.e. character by character; on the contrary, U0 in unpack() indicates UTF-8 mode, where the packed string is processed in its UTF-8-encoded Unicode form on a byte by byte basis. This is reversed with regard to perl 5.8.X, but now consistent between pack() and unpack().
Moreover, C0 and U0 can also be used in pack() templates to specify respectively character and byte modes.
C0 and U0 in the middle of a pack or unpack format now switch to the specified encoding mode, honoring parens grouping. Previously, parens were ignored.
Also, there is a new pack() character format, W, which is intended to replace the old C. C is kept for unsigned chars coded as bytes in the string's internal representation. W represents unsigned (logical) character values, which can be greater than 255. It is therefore more robust when dealing with potentially UTF-8-encoded data (as C will wrap values outside the range 0..255, and not respect the string encoding).
In practice, that means that pack formats are now encoding-neutral, except C.
For consistency, A in unpack() format now trims all Unicode whitespace from the end of the string. Before perl 5.9.2, it used to strip only the classical ASCII space characters.
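The difference between W and C for a value above 255 can be sketched as follows. The value 300 is arbitrary, and the 'pack' warning that C emits when it wraps is silenced locally:

```perl
use strict;
use warnings;

# W stores the logical character value as-is, even above 255.
my $wide = pack 'W', 300;

# C is limited to 0..255 and wraps larger values (300 % 256 == 44).
my $wrapped = do { no warnings 'pack'; pack 'C', 300 };

printf "W: length %d, ord %d\n", length $wide,    ord $wide;
printf "C: length %d, ord %d\n", length $wrapped, ord $wrapped;
```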
Byte/character count feature in unpack()
A new unpack() template character, ".", returns the number of bytes or characters (depending on the selected encoding mode, see above) read so far.
The $* and $# variables have been removed
$*, which was deprecated in favor of the /s and /m regexp modifiers, has been removed.
The deprecated $# variable (output format for numbers) has been removed.
Two new severe warnings, "$#/$* is no longer supported", have been added.
substr() lvalues are no longer fixed-length
The lvalues returned by the three argument form of substr() used to be a "fixed length window" on the original string. In some cases this could cause surprising action at a distance or other undefined behaviour. Now the length of the window adjusts itself to the length of the string assigned to it.
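The adjusting window can be observed by aliasing a substr() lvalue with a foreach loop and assigning strings of different lengths to it. This is a minimal sketch with an arbitrary input string:

```perl
use strict;
use warnings;

my $string = '1234';
my @snapshots;

# Inside the loop, $_ is an alias for the two-character window '23'.
for (substr($string, 1, 2)) {
    $_ = 'a';                    # shorter than the window
    push @snapshots, $string;
    $_ = 'xyz';                  # longer than the original window
    push @snapshots, $string;
}

print "$_\n" for @snapshots;
```

After the first assignment the window covers just the single character that replaced '23', so the second assignment replaces that character rather than spilling into the trailing '4'.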
Parsing of -f _
The identifier _ is now forced to be a bareword after a filetest operator. This solves a number of misparsing issues when a global _ subroutine is defined.
:unique
The :unique attribute has been made a no-op, since its current implementation was fundamentally flawed and not threadsafe.
Effect of pragmas in eval
The compile-time value of the %^H hint variable can now propagate into eval("")uated code. This makes it more useful to implement lexical pragmas.
As a side-effect of this, the overloaded-ness of constants now propagates into eval("").
chdir FOO
A bareword argument to chdir() is now recognized as a file handle. Earlier releases interpreted the bareword as a directory name. (Gisle Aas)
Handling of .pmc files
An old feature of perl was that before require or use look for a file with a .pm extension, they will first look for a similar filename with a .pmc extension. If this file is found, it will be loaded in place of any potentially existing file ending in a .pm extension.
Previously, .pmc files were loaded only if more recent than the matching .pm file. Starting with 5.9.4, they'll always be loaded if they exist.
$^V is now a version object instead of a v-string
$^V can still be used with the %vd format in printf, but any character-level operations will now access the string representation of the version object and not the ordinals of a v-string. Expressions like substr($^V, 0, 2) or split //, $^V no longer work and must be rewritten.
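As a sketch of the form that keeps working, the %vd format can be used to obtain the dotted version number, which can then be split on dots rather than on characters:

```perl
use strict;
use warnings;

# %vd formats the version as dot-separated integers, e.g. "5.10.0".
my $dotted = sprintf '%vd', $^V;

# Split the dotted form instead of doing character-level operations on $^V.
my ($revision, $version, $subversion) = split /\./, $dotted;

print "running perl $dotted (revision $revision, version $version)\n";
```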
@- and @+ in patterns
The special arrays @- and @+ are no longer interpolated in regular expressions. (Sadahiro Tomoyuki)
$AUTOLOAD can now be tainted
If you call a subroutine by a tainted name, and if it defers to an AUTOLOAD function, then $AUTOLOAD will be (correctly) tainted. (Rick Delaney)
Tainting and printf
When perl is run under taint mode, printf() and sprintf() will now reject any tainted format argument. (Rafael Garcia-Suarez)
undef and signal handlers
Undefining or deleting a signal handler via undef $SIG{FOO} is now equivalent to setting it to 'DEFAULT'. (Rafael Garcia-Suarez)
strictures and dereferencing in defined()
use strict 'refs' was ignoring taking a hard reference in an argument to defined(), as in:
    use strict 'refs';
    my $x = 'foo';
    if (defined $$x) {...}
This now correctly produces the run-time error "Can't use string as a SCALAR ref while "strict refs" in use".
defined @$foo and defined %$bar are now also subject to strict 'refs' (that is, $foo and $bar shall be proper references there). (defined(@foo) and defined(%bar) are discouraged constructs anyway.) (Nicholas Clark)
(?p{}) has been removed
The regular expression construct (?p{}), which was deprecated in perl 5.8, has been removed. Use (??{}) instead. (Rafael Garcia-Suarez)
Pseudo-hashes have been removed
Support for pseudo-hashes has been removed from Perl 5.9. (The fields pragma remains here, but uses an alternate implementation.)
Removal of the bytecode compiler and of perlcc
perlcc, the byteloader and the supporting modules (B::C, B::CC, B::Bytecode, etc.) are no longer distributed with the perl sources. Those experimental tools have never worked reliably, and, due to the lack of volunteers to keep them in line with the perl interpreter developments, it was decided to remove them instead of shipping a broken version of those. The last version of those modules can be found with perl 5.9.4.
However, the B compiler framework stays supported in the perl core, as do the more useful modules it has made possible (among others, B::Deparse and B::Concise).
Removal of the JPL
The JPL (Java-Perl Lingo) has been removed from the perl sources tarball.
Recursive inheritance detected earlier
Perl will now immediately throw an exception if you modify any package's @ISA in such a way that it would cause recursive inheritance.
Previously, the exception would not occur until Perl attempted to make use of the recursive inheritance while resolving a method or doing a $foo->isa($bar) lookup.
Modules and Pragmata
Upgrading individual core modules
Even more core modules are now also available separately through the CPAN. If you wish to update one of these modules, you don't need to wait for a new perl release. From within the cpan shell, running the 'r' command will report on modules with upgrades available. See perldoc CPAN for more information.
Pragmata Changes
feature
The new pragma feature is used to enable new features that might break old code. See "The feature pragma" above.
mro
This new pragma lets you change the algorithm used to resolve inherited methods. See "New Pragma, mro" above.
Scoping of the sort pragma
The sort pragma is now lexically scoped. Its effect used to be global.
Scoping of bignum, bigint, bigrat
The three numeric pragmas bignum, bigint and bigrat are now lexically scoped. (Tels)
base
The base pragma now warns if a class tries to inherit from itself. (Curtis "Ovid" Poe)
strict and warnings
strict and warnings will now complain loudly if they are loaded via incorrect casing (as in use Strict;). (Johan Vromans)
version
The version module provides support for version objects.
warnings
The warnings pragma doesn't load Carp anymore. That means that code that used Carp routines without having loaded it at compile time might need to be adjusted; typically, the following (faulty) code won't work anymore, and will require parentheses to be added after the function name:
    use warnings;
    require Carp;
    Carp::confess 'argh';
less
less now does something useful (or at least it tries to). In fact, it has been turned into a lexical pragma. So, in your modules, you can now test whether your users have requested to use less CPU, or less memory, less magic, or maybe even less fat. See less for more. (Joshua ben Jore)
New modules
o  encoding::warnings, by Audrey Tang, is a module to emit warnings whenever an ASCII character string containing high-bit bytes is implicitly converted into UTF-8. It's a lexical pragma since Perl 5.9.4; on older perls, its effect is global.
o  Module::CoreList, by Richard Clamp, is a small handy module that tells you what versions of core modules ship with any versions of Perl 5. It comes with a command-line frontend, corelist.
o  Math::BigInt::FastCalc is an XS-enabled, and thus faster, version of Math::BigInt::Calc.
o  Compress::Zlib is an interface to the zlib compression library. It comes with a bundled version of zlib, so having a working zlib is not a prerequisite to install it. It's used by Archive::Tar (see below).
o  IO::Zlib is an IO::-style interface to Compress::Zlib.
o  Archive::Tar is a module to manipulate tar archives.
o  Digest::SHA, a module used to calculate many types of SHA digests, has been included for SHA support in the CPAN module.
o  ExtUtils::CBuilder and ExtUtils::ParseXS have been added.
o  Hash::Util::FieldHash, by Anno Siegel, has been added. This module provides support for field hashes: hashes that maintain an association of a reference with a value, in a thread-safe garbage-collected way. Such hashes are useful to implement inside-out objects.
o  Module::Build, by Ken Williams, has been added. It's an alternative to ExtUtils::MakeMaker to build and install perl modules.
o  Module::Load, by Jos Boumans, has been added. It provides a single interface to load Perl modules and .pl files.
o  Module::Loaded, by Jos Boumans, has been added. It's used to mark modules as loaded or unloaded.
o  Package::Constants, by Jos Boumans, has been added. It's a simple helper to list all constants declared in a given package.
o  Win32API::File, by Tye McQueen, has been added (for Windows builds). This module provides low-level access to Win32 system API calls for files/dirs.
o  Locale::Maketext::Simple, needed by CPANPLUS, is a simple wrapper around Locale::Maketext::Lexicon. Note that Locale::Maketext::Lexicon isn't included in the perl core; the behaviour of Locale::Maketext::Simple gracefully degrades when the latter isn't present.
o  Params::Check implements a generic input parsing/checking mechanism. It is used by CPANPLUS.
o  Term::UI simplifies the task of asking questions at a terminal prompt.
o  Object::Accessor provides an interface to create per-object accessors.
o  Module::Pluggable is a simple framework to create modules that accept pluggable sub-modules.
o  Module::Load::Conditional provides simple ways to query and possibly load installed modules.
o  Time::Piece provides an object oriented interface to time functions, overriding the built-ins localtime() and gmtime().
o  IPC::Cmd helps to find and run external commands, possibly interactively.
o  File::Fetch provides a simple generic file fetching mechanism.
o  Log::Message and Log::Message::Simple are used by the log facility of CPANPLUS.
o  Archive::Extract is a generic archive extraction mechanism for .tar (plain, gzipped or bzipped) or .zip files.
o  CPANPLUS provides an API and a command-line tool to access the CPAN mirrors.
o  Pod::Escapes provides utilities that are useful in decoding Pod E<...> sequences.
o  Pod::Simple is now the backend for several of the Pod-related modules included with Perl.
Selected Changes to Core Modules
Attribute::Handlers
Attribute::Handlers can now report the caller's file and line number. (David Feldman) All interpreted attributes are now passed as array references. (Damian Conway)
B::Lint
B::Lint is now based on Module::Pluggable, and so can be extended with plugins. (Joshua ben Jore)
B
It's now possible to access the lexical pragma hints (%^H) by using the method B::COP::hints_hash(). It returns a B::RHE object, which in turn can be used to get a hash reference via the method B::RHE::HASH(). (Joshua ben Jore)
Thread
As the old 5005thread threading model has been removed, in favor of the ithreads scheme, the Thread module is now a compatibility wrapper, to be used in old code only. It has been removed from the default list of dynamic extensions.
Utility Changes
perl -d
The Perl debugger can now save all debugger commands for sourcing later; notably, it can now emulate stepping backwards, by restarting and rerunning all bar the last command from a saved command history.
It can also display the parent inheritance tree of a given class, with the i command.
ptar
ptar is a pure perl implementation of tar that comes with Archive::Tar.
ptardiff
ptardiff is a small utility used to generate a diff between the contents of a tar archive and a directory tree. Like ptar, it comes with Archive::Tar.
shasum
shasum is a command-line utility, used to print or to check SHA digests. It comes with the new Digest::SHA module.
corelist
The corelist utility is now installed with perl (see "New modules" above).
h2ph and h2xs
h2ph and h2xs have been made more robust with regard to "modern" C code.
h2xs implements a new option --use-xsloader to force use of XSLoader even in backwards compatible modules. The handling of authors' names that had apostrophes has been fixed. Any enums with negative values are now skipped.
perlivp
perlivp no longer checks for *.ph files by default. Use the new -a option to run all tests.
find2perl
find2perl now assumes -print as a default action. Previously, it needed to be specified explicitly. Several bugs have been fixed in find2perl, regarding -exec and -eval. Also the options -path, -ipath and -iname have been added.
config_data
config_data is a new utility that comes with Module::Build. It provides a command-line interface to the configuration of Perl modules that use Module::Build's framework of configurability (that is, *::ConfigData modules that contain local configuration information for their parent modules.)
cpanp
cpanp, the CPANPLUS shell, has been added. (cpanp-run-perl, a helper for CPANPLUS operation, has been added too, but isn't intended for direct use).
cpan2dist
cpan2dist is a new utility that comes with CPANPLUS. It's a tool to create distributions (or packages) from CPAN modules.
pod2html
The output of pod2html has been enhanced to be more customizable via CSS. Some formatting problems were also corrected. (Jari Aalto)
New Documentation
The perlpragma manpage documents how to write one's own lexical pragmas in pure Perl (something that is possible starting with 5.9.4).
The new perlglossary manpage is a glossary of terms used in the Perl documentation, technical and otherwise, kindly provided by O'Reilly Media, Inc.
The perlreguts manpage, courtesy of Yves Orton, describes internals of the Perl regular expression engine.
The perlreapi manpage describes the interface to the perl interpreter used to write pluggable regular expression engines (by Ævar Arnfjörð Bjarmason).
The perlunitut manpage is a tutorial for programming with Unicode and string encodings in Perl, courtesy of Juerd Waalboer.
A new manual page, perlunifaq (the Perl Unicode FAQ), has been added (Juerd Waalboer).
The perlcommunity manpage gives a description of the Perl community on the Internet and in real life. (Edgar "Trizor" Bering)
The CORE manual page documents the CORE:: namespace. (Tels)
The long-existing feature of /(?{...})/ regexps setting $_ and pos() is now documented.
Performance Enhancements
In-place sorting
Sorting arrays in place (@a = sort @a) is now optimized to avoid making a temporary copy of the array.
Likewise, reverse sort ... is now optimized to sort in reverse, avoiding the generation of a temporary intermediate list.
Lexical array access
Access to elements of lexical arrays via a numeric constant between 0 and 255 is now faster. (This used to be only the case for global arrays.)
XS-assisted SWASHGET
Some pure-perl code that perl was using to retrieve Unicode properties and transliteration mappings has been reimplemented in XS.
Constant subroutines
The interpreter internals now support a far more memory-efficient form of inlineable constants. Storing a reference to a constant value in a symbol table is equivalent to a full typeglob referencing a constant subroutine, but uses about 400 bytes less memory. This proxy constant subroutine is automatically upgraded to a real typeglob with subroutine if necessary. The approach taken is analogous to the existing space optimisation for subroutine stub declarations, which are stored as plain scalars in place of the full typeglob.
Several of the core modules have been converted to use this feature for their system-dependent constants. As a result, use POSIX; now takes about 200K less memory.
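The following is a minimal sketch of the mechanism described above, not a recommended coding style; in ordinary code the constant pragma is the usual way to declare constants. The package name My::Limits and the constant MAX_RETRIES are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical package and constant names, for illustration only.
BEGIN {
    # Store a reference to a constant value directly in the symbol table,
    # rather than creating a typeglob that holds a constant subroutine.
    $My::Limits::{MAX_RETRIES} = \5;
}

# Using the name as a subroutine makes perl treat the stored reference
# as a constant subroutine returning the referenced value.
my $retries = My::Limits::MAX_RETRIES();
```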
PERL_DONT_CREATE_GVSV
The new compilation flag PERL_DONT_CREATE_GVSV, introduced as an option in perl 5.8.8, is turned on by default in perl 5.9.3. It prevents perl from creating an empty scalar with every new typeglob. See perl588delta for details.
Weak references are cheaper
Weak reference creation is now O(1) rather than O(n), courtesy of Nicholas Clark. Weak reference deletion remains O(n), but if deletion only happens at program exit, it may be skipped completely.
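For readers unfamiliar with weak references, here is a small sketch using Scalar::Util, whose weaken() and isweak() functions are the standard interface. The parent/child data structure is invented for the example.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak);

# Hypothetical parent/child records that point at each other.
my $parent = { name => 'parent' };
my $child  = { name => 'child', parent => $parent };
$parent->{child} = $child;

# Weaken the back-reference so the cycle does not keep both hashes alive.
weaken($child->{parent});

my $is_weak = isweak($child->{parent}) ? 1 : 0;
```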
sort() enhancements
Salvador Fandiño provided improvements to reduce the memory usage of sort and to speed up some cases.
Memory optimisations
Several internal data structures (typeglobs, GVs, CVs, formats) have been restructured to use less memory. (Nicholas Clark)
UTF-8 cache optimisation
The UTF-8 caching code is now more efficient, and used more often. (Nicholas Clark)
Sloppy stat on Windows
On Windows, perl's stat() function normally opens the file to determine the link count and update attributes that may have been changed through hard links. Setting ${^WIN32_SLOPPY_STAT} to a true value speeds up stat() by not performing this operation. (Jan Dubois)
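A minimal sketch of limiting the setting to one block with local, so the rest of the program keeps the default behaviour. The choice of the current directory as the stat() target is arbitrary; on platforms other than Windows the assignment should be harmless but has no effect.

```perl
use strict;
use warnings;

my @info;
{
    # Only stat() calls made inside this block use the faster, sloppier mode.
    local ${^WIN32_SLOPPY_STAT} = 1;
    @info = stat('.');
}
```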
Regular expressions optimisations
Engine de-recursivised
The regular expression engine is no longer recursive, meaning that patterns that used to overflow the stack will either die with useful explanations, or run to completion, which, since they were able to blow the stack before, will likely take a very long time to happen. If you were experiencing the occasional stack overflow (or segfault) and upgrade to discover that now perl apparently hangs instead, look for a degenerate regex. (Dave Mitchell)
Single char char-classes treated as literals
Classes of a single character are now treated the same as if the character had been used as a literal, meaning that code that uses char-classes as an escaping mechanism will see a speedup. (Yves Orton)
Trie optimisation of literal string alternations
Alternations, where possible, are optimised into more efficient matching structures. String literal alternations are merged into a trie and are matched simultaneously. This means that instead of O(N) time for matching N alternations at a given point, the new code performs in O(1) time. A new special variable, ${^RE_TRIE_MAXBUF}, has been added to fine-tune this optimization. (Yves Orton)
Note: Much code exists that works around perl's historic poor performance on alternations. Often the tricks used to do so will disable the new optimisations. Hopefully the utility modules used for this purpose will be educated about these new optimisations.
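The kind of pattern that can benefit is a plain alternation of literal strings. The sketch below builds one from a word list; the keyword list and sample lines are invented, and the code does not itself demonstrate that the trie is used.

```perl
use strict;
use warnings;

# Hypothetical word list, for illustration only.
my @keywords    = qw(given when default say state);
my $alternation = join '|', map { quotemeta } @keywords;

# A straightforward alternation of literals, left for the engine to optimise.
my $keyword_re = qr/\b(?:$alternation)\b/;

my @found = grep { $_ =~ $keyword_re } ('say "hi"', 'print "hi"', 'state $n');
```

Writing the alternation directly, rather than hand-factoring common prefixes, is what leaves the engine free to apply its own optimisation.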
Aho-Corasick start-point optimisation
When a pattern starts with a trie-able alternation and there aren't better optimisations available, the regex engine will use Aho-Corasick matching to find the start point. (Yves Orton)
Installation and Configuration Improvements
Configuration improvements
-Dusesitecustomize
Run-time customization of @INC can be enabled by passing the -Dusesitecustomize flag to Configure. When enabled, this will make perl run $sitelibexp/sitecustomize.pl before anything else. This script can then be set up to add additional entries to @INC.
Relocatable installations
There is now Configure support for creating a relocatable perl tree. If you Configure with -Duserelocatableinc, then the paths in @INC (and everything else in %Config) can be optionally located via the path of the perl executable. That means that, if the string ".../" is found at the start of any path, it's substituted with the directory of $^X. So, the relocation can be configured on a per-directory basis, although the default with -Duserelocatableinc is that everything is relocated. The initial install is done to the original configured prefix.
strlcat() and strlcpy()
The configuration process now detects whether strlcat() and strlcpy() are available. When they are not available, perl's own version is used (from Russ Allbery's public domain implementation). Various places in the perl interpreter now use them. (Steve Peters)
d_pseudofork and d_printf_format_null
A new configuration variable, available as $Config{d_pseudofork} in the Config module, has been added, to distinguish real fork() support from fake pseudofork used on Windows platforms. A new configuration variable, d_printf_format_null, has been added, to see if printf-like formats are allowed to be NULL.
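A script might consult the new variable roughly as follows. This is a sketch; the descriptive strings are arbitrary, and d_fork is the long-standing Config entry for native fork() support.

```perl
use strict;
use warnings;
use Config;

# Classify the kind of fork() this perl was built with.
my $fork_kind = $Config{d_pseudofork} ? 'emulated fork (pseudofork)'
              : $Config{d_fork}       ? 'native fork'
              :                         'no fork';
```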
Configure help
Configure -h has been extended with the most commonly used options.
Compilation improvements
Parallel build
Parallel makes should work properly now, although there may still be problems if make test is instructed to run in parallel.
Borland's compilers support
Building with Borland's compilers on Win32 should work more smoothly. In particular Steve Hay has worked to side step many warnings emitted by their compilers and at least one C compiler internal error.
Static build on Windows
Perl extensions on Windows now can be statically built into the Perl DLL. Also, it's now possible to build a perl-static.exe that doesn't depend on the Perl DLL on Win32. See the Win32 makefiles for details. (Vadim Konovalov)
ppport.h files
All ppport.h files in the XS modules bundled with perl are now autogenerated at build time. (Marcus Holland-Moritz)
C++ compatibility
Efforts have been made to make perl and the core XS modules compilable with various C++ compilers (although the situation is not perfect with some of the compilers on some of the platforms tested).
Support for Microsoft 64-bit compiler
Support for building perl with Microsoft's 64-bit compiler has been improved. (ActiveState)
Visual C++
Perl can now be compiled with Microsoft Visual C++ 2005 (and 2008 Beta 2).
Win32 builds
All win32 builds (MS-Win, WinCE) have been merged and cleaned up.
Installation improvements
Module auxiliary files
README files and changelogs for CPAN modules bundled with perl are no longer installed.
New Or Improved Platforms
Perl has been reported to work on Symbian OS. See perlsymbian for more information.
Many improvements have been made towards making Perl work correctly on z/OS.
Perl has been reported to work on DragonFlyBSD and MidnightBSD.
Perl has also been reported to work on NexentaOS ( http://www.gnusolaris.org/ ).
The VMS port has been improved. See perlvms.
Support for Cray XT4 Catamount/Qk has been added. See hints/catamount.sh in the source code distribution for more information.
Vendor patches have been merged for RedHat and Gentoo.
DynaLoader::dl_unload_file() now works on Windows.
Selected Bug Fixes
strictures in regexp-eval blocks
strict wasn't in effect in regexp-eval blocks ( /(?{...})/ ).
Calling CORE::require()
CORE::require() and CORE::do() were always parsed as require() and do() when they were overridden. This is now fixed.
Subscripts of slices
You can now use a non-arrowed form for chained subscripts after a list slice, like in:
({foo => "bar"})[0]{foo}
This used to be a syntax error; a -> was required.
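A slightly fuller sketch of the same syntax, using a hypothetical helper that returns a list of references:

```perl
use strict;
use warnings;

# Hypothetical helper returning a hash reference and an array reference.
sub release_info {
    return ({ name => 'perl', version => '5.10.0' }, [10, 20, 30]);
}

# Chained subscripts directly after the list slice, without an arrow.
my $version = (release_info())[0]{version};
my $second  = (release_info())[1][1];
```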
no warnings category works correctly with -w
Previously, when running with warnings enabled globally via -w, selectively disabling a specific warning category would actually turn off all warnings. This is now fixed; no warnings 'io'; will now only turn off warnings in the io class.
threads improvements
Several memory leaks in ithreads were closed. Also, ithreads were made less memory-intensive.
threads is now a dual-life module, also available on CPAN. It has been expanded in many ways. A kill() method is available for thread signalling. One can get thread status, or the list of running or joinable threads. A new threads->exit() method is used to exit from the application (this is the default for the main thread) or from the current thread only (this is the default for all other threads). On the other hand, the exit() built-in now always causes the whole application to terminate. (Jerry D. Hedden)
chr() and negative values
chr() on a negative value now gives \x{FFFD}, the Unicode replacement character, except when the bytes pragma is in effect, in which case the low eight bits of the value are used.
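Both behaviours can be seen side by side in a short sketch. The variable names are invented, and the utf8 warning category is silenced only to keep the example quiet.

```perl
use strict;
use warnings;

my $negative = -1;

# Character semantics: the result is the Unicode replacement character.
my $replacement = do { no warnings 'utf8'; chr($negative) };

# Under the bytes pragma: only the low eight bits are used.
my $low_byte = do { use bytes; no warnings 'utf8'; chr($negative) };

my @code_points = (ord($replacement), ord($low_byte));   # 0xFFFD and 0xFF
```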
PERL5SHELL and tainting
On Windows, the PERL5SHELL environment variable is now checked for taintedness. (Rafael Garcia-Suarez)
Using *FILE{IO}
stat() and -X filetests now treat *FILE{IO} filehandles like *FILE filehandles. (Steve Peters)
Overloading and reblessing
Overloading now works when references are reblessed into another class. Internally, this has been implemented by moving the flag for overloading from the reference to the referent, which logically is where it should always have been. (Nicholas Clark)
Overloading and UTF-8
A few bugs related to UTF-8 handling with objects that have stringification overloaded have been fixed. (Nicholas Clark)
eval memory leaks fixed
Traditionally, eval 'syntax error' has leaked badly. Many (but not all) of these leaks have now been eliminated or reduced. (Dave Mitchell)
Random device on Windows
In previous versions, perl would read the file /dev/urandom if it existed when seeding its random number generator. That file is unlikely to exist on Windows, and if it did would probably not contain appropriate data, so perl no longer tries to read it on Windows. (Alex Davies)
PERLIO_DEBUG
The PERLIO_DEBUG environment variable no longer has any effect for setuid scripts and for scripts run with -T. Moreover, with a thread-enabled perl, using PERLIO_DEBUG could lead to an internal buffer overflow. This has been fixed.
PerlIO::scalar and read-only scalars
PerlIO::scalar will now prevent writing to read-only scalars. Moreover, seek() is now supported with PerlIO::scalar-based filehandles, the underlying string being zero-filled as needed. (Rafael, Jarkko Hietaniemi)
study() and UTF-8
study() never worked for UTF-8 strings, but could lead to false results. It's now a no-op on UTF-8 data. (Yves Orton)
Critical signals
The signals SIGILL, SIGBUS and SIGSEGV are now always delivered in an unsafe manner (unlike other signals, which are deferred until the perl interpreter reaches a reasonably stable state; see Deferred Signals (Safe Signals) in perlipc). (Rafael)
@INC-hook fix
When a module or a file is loaded through an @INC-hook, and when this hook has set a filename entry in %INC, __FILE__ is now set for this module according to the contents of that %INC entry. (Rafael)
-t switch fix
The -w and -t switches can now be used together without messing up which categories of warnings are activated. (Rafael)
Duping UTF-8 filehandles
Duping a filehandle which has the :utf8 PerlIO layer set will now properly carry that layer on the duped filehandle. (Rafael)
Localisation of hash elements
Localizing a hash element whose key was given as a variable didn't work correctly if the variable was changed while the local() was in effect (as in local $h{$x}; ++$x). (Bo Lindbergh)
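The fixed behaviour can be sketched as follows; the hash contents are invented. The element that was localized is the one restored on scope exit, even though the key variable has changed in the meantime.

```perl
use strict;
use warnings;

my %h = (1 => 'one', 2 => 'two');
my $x = 1;

{
    local $h{$x} = 'temporary';   # localizes $h{1}
    ++$x;                         # $x is now 2, but $h{1} is what was saved
}

# On leaving the block, $h{1} is restored and $h{2} is left untouched.
```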
New or Changed Diagnostics
Use of uninitialized value
Perl will now try to tell you the name of the variable (if any) that was undefined.
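A small sketch that captures the warning text so the variable name can be seen; the variable names are invented for the example.

```perl
use strict;
use warnings;

my $message = '';
{
    # Collect warnings instead of printing them to STDERR.
    local $SIG{__WARN__} = sub { $message .= $_[0] };

    my $name;                            # deliberately left undefined
    my $greeting = 'Hello, ' . $name;    # triggers the warning
}

# $message now names the offending variable, $name.
```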
Deprecated use of my() in false conditional
A new deprecation warning, Deprecated use of my() in false conditional, has been added, to warn against the use of the dubious and deprecated construct
my $x if 0;
See perldiag. Use state variables instead.
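As a sketch of the suggested replacement, a counter that keeps its value between calls can be written with a state variable. The subroutine name next_id is hypothetical.

```perl
use strict;
use warnings;
use feature 'state';

# Hypothetical counter; $counter is initialised once and persists across calls.
sub next_id {
    state $counter = 0;
    return ++$counter;
}

my @ids = map { next_id() } 1 .. 3;
```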
!=~ should be !~
A new warning, !=~ should be !~, is emitted to prevent this misspelling of the non-matching operator.
Newline in left-justified string
The warning Newline in left-justified string has been removed.
Too late for -T option
The error Too late for -T option has been reformulated to be more descriptive.
"%s" variable %s masks earlier declaration
This warning is now emitted in more consistent cases; in short, when one of the declarations involved is a my variable:
my $x; my $x; # warns
my $x; our $x; # warns
our $x; my $x; # warns
On the other hand, the following:
our $x; our $x;
now gives a "our" variable %s redeclared warning.
readdir()/closedir()/etc. attempted on invalid dirhandle
These new warnings are now emitted when a dirhandle is used but is either closed or not really a dirhandle.
Opening dirhandle/filehandle %s also as a file/directory
Two deprecation warnings have been added: (Rafael)
Opening dirhandle %s also as a file
Opening filehandle %s also as a directory
Use of -P is deprecated
Perl's command-line switch -P is now deprecated.
v-string in use/require is non-portable
Perl will warn you against potential backwards compatibility problems with the use VERSION syntax.
perl -V
perl -V has several improvements, making it more usable from shell scripts to get the value of configuration variables. See perlrun for details.
Changed Internals
In general, the source code of perl has been refactored, tidied up, and optimized in many places. Also, memory management and allocation have been improved in several places.
When compiling the perl core with gcc, as many gcc warning flags are turned on as is possible on the platform. (This quest for cleanliness doesn't extend to XS code because we cannot guarantee the tidiness of code we didn't write.) Similar strictness flags have been added or tightened for various other C compilers.
Reordering of SVt_* constants
The relative ordering of constants that define the various types of SV has changed; in particular, SVt_PVGV has been moved before SVt_PVLV, SVt_PVAV, SVt_PVHV and SVt_PVCV. This is unlikely to make any difference unless you have code that explicitly makes assumptions about that ordering. (The inheritance hierarchy of B::* objects has been changed to reflect this.)
Elimination of SVt_PVBM
Related to this, the internal type SVt_PVBM has been removed. This dedicated type of SV was used by the index operator and parts of the regexp engine to facilitate fast Boyer-Moore matches. Its use internally has been replaced by SVs of type SVt_PVGV.
New type SVt_BIND
A new type SVt_BIND has been added, in readiness for the project to implement Perl 6 on 5. There deliberately is no implementation yet, and they cannot yet be created or destroyed.
Removal of CPP symbols
The C preprocessor symbols PERL_PM_APIVERSION and PERL_XS_APIVERSION, which were supposed to give the version number of the oldest perl binary-compatible (resp. source-compatible) with the present one, were not used, and sometimes had misleading values. They have been removed.
Less space is used by ops
The BASEOP structure now uses less space. The op_seq field has been removed and replaced by a single-bit bit-field op_opt. op_type is now 9 bits long. (Consequently, the B::OP class doesn't provide a seq method anymore.)
New parser
perl's parser is now generated by bison (it used to be generated by byacc). As a result, it seems to be a bit more robust.
Also, Dave Mitchell improved the lexer debugging output under -DT.
Use of const
Andy Lester supplied many improvements to determine which function parameters and local variables could actually be declared const to the C compiler. Steve Peters provided new *_set macros and reworked the core to use these rather than assigning to macros in LVALUE context.
Mathoms
A new file, mathoms.c, has been added. It contains functions that are no longer used in the perl core, but that remain available for binary or source compatibility reasons. However, those functions will not be compiled in if you add -DNO_MATHOMS in the compiler flags.
AvFLAGS has been removed
The AvFLAGS macro has been removed.
av_* changes
The av_*() functions, used to manipulate arrays, no longer accept null AV* parameters.
$^H and %^H
The implementation of the special variables $^H and %^H has changed, to allow implementing lexical pragmas in pure Perl.
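A minimal sketch of a lexical pragma written in pure Perl, using the technique documented in perlpragma: store a value in %^H at compile time and read it back at run time through caller(). The package name My::Pragma and the hint key are hypothetical. The pragma is kept in the same file here, so import() is invoked from a BEGIN block in place of a use statement.

```perl
use strict;
use warnings;

package My::Pragma;   # hypothetical pragma name

# Called at compile time; %^H is scoped to the block being compiled.
sub import   { $^H{'My::Pragma/enabled'} = 1 }
sub unimport { $^H{'My::Pragma/enabled'} = 0 }

# At run time, element 10 of caller() is the hints hash that was in
# effect where the calling statement was compiled.
sub in_effect {
    my $hints = (caller(0))[10];
    return ($hints && $hints->{'My::Pragma/enabled'}) ? 1 : 0;
}

package main;

my ($inside, $outside);
{
    BEGIN { My::Pragma->import }          # stands in for "use My::Pragma;"
    $inside = My::Pragma::in_effect();    # pragma is in effect here
}
$outside = My::Pragma::in_effect();       # hint did not leak out of the block
```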
B:: modules inheritance changed
The inheritance hierarchy of B:: modules has changed; B::NV now inherits from B::SV (it used to inherit from B::IV).
Anonymous hash and array constructors
The anonymous hash and array constructors now take 1 op in the optree instead of 3, now that pp_anonhash and pp_anonlist return a reference to a hash/array when the op is flagged with OPf_SPECIAL. (Nicholas Clark)
Known Problems
There's still a remaining problem in the implementation of the lexical $_: it doesn't work inside /(?{...})/ blocks. (See the TODO test in t/op/mydef.t.)
Stacked filetest operators won't work when the filetest pragma is in effect, because they rely on the stat() buffer _ being populated, and filetest bypasses stat().
UTF-8 problems
The handling of Unicode still is unclean in several places, where it's dependent on whether a string is internally flagged as UTF-8. This will be made more consistent in perl 5.12, but that won't be possible without a certain amount of backwards incompatibility.
Platform Specific Problems
When compiled with g++ and thread support on Linux, it's reported that $! stops working correctly. This is related to the fact that glibc provides two strerror_r(3) implementations, and perl selects the wrong one.
Reporting Bugs
If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at http://rt.perl.org/rt3/ . There may also be information at http://www.perl.org/ , the Perl Home Page.
If you believe you have an unreported bug, please run the perlbug program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of perl -V, will be sent off to perlbug@perl.org to be analysed by the Perl porting team.
SEE ALSO
The Changes file and the perl590delta to perl595delta man pages for exhaustive details on what changed.
The INSTALL file for how to build Perl.
The README file for general stuff.
The Artistic and Copying files for copyright information.